Semantic system for integrating software components

ABSTRACT

A method and a scripting paradigm for automatically integrating disparate information systems (e.g., web services and databases) within a given enterprise into a service-oriented architecture. A script writer generates a script using a scripting paradigm, and the resulting script automatically derives new data models, new ontological structures, new mappings, and a new web service that integrates disparate information systems. In addition to integrating disparate information systems, the scripts may be harvested to automate the metadata discovery and retrieval process. The scripting paradigm builds upon existing open-source scripting languages and is compatible with existing internet browsers, thus encouraging mass participation in the integration process.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation-in-part of pending U.S. patent application Ser. No. 11/377,459, filed on Mar. 17, 2006, which is hereby incorporated by reference in its entirety, and claims priority to U.S. Provisional Patent Application No. 60/873,248, filed on Dec. 7, 2006, and to U.S. Provisional Patent Application No. 60/900,312, filed on Feb. 9, 2007, which are both incorporated herein by reference in their entireties.

STATEMENT REGARDING FEDERALLY-SPONSORED RESEARCH AND DEVELOPMENT

The U.S. government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Contract No. FA8721-07-C-0001 awarded by the United States Air Force.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a system for integrating software components using a semantic ontology management system. The present invention further relates to a system for integrating disparate information systems into a service-oriented architecture to provide data interoperability across an enterprise.

2. Background Art

Web service standards are enjoying widespread adoption in corporations across many industries. Corporations are recognizing the value of making it easier for other applications to consume their data by using web standards such as the Hyper Text Transfer Protocol (HTTP), web addressing, and Extensible Markup Language (XML). Using these standards, software clients written in one programming language can access and retrieve information from a server, irrespective of the technology (e.g., hardware, operating system, and programming language) that the server uses.

However, even with the adoption of these web standards, problems remain. For example, although XML is mature as a syntax for web data exchange, current XML technologies do not supply the capabilities provided by more mature technologies like relational database systems. Also, while solutions that aid in web service discovery (e.g., Universal Description, Discovery and Integration (UDDI), as described at http://www.uddi.org/specification.html, incorporated by reference herein) and invocation (e.g., Web Services Description Language (WSDL), as described at http://www.w3.org/2002/ws/desc, incorporated by reference herein) are emerging, they are far from mature. Similarly, technologies that reason with web service description files for the purpose of chaining web services are not available. It is left to the programmer to determine, at design time, which web services to invoke, the order in which they need to be invoked, and the formatting of information necessary to complete an operation. As a result, the programmer writes much of the “glue code” necessary for a particular computation. A need still exists for methods and systems that allow automatic generation of such “glue code,” as well as automatic discovery and integration of available web services and data repositories.

Further, large enterprises are moving towards a service-oriented architecture strategy to make data repositories and web services from one part of the enterprise available for use in another part. However, these service-oriented architectures often fail to address the semantics of individual information systems. The representation of position on a particular information system may lack a reference datum, and hence, position data on one information system may not be easily accessed and utilized by other information systems. Another example of such a semantic mismatch is the representation of time on an individual information system, which may lack the essential designation of a time zone. The failure of service-oriented architectures to recognize and address data semantics and semantic mismatches leads not only to execution errors, but also to lengthy testing and integration cycles that find and correct the resulting errors (see Herrera, X., “The bottom line for accurate massed fires: common grid,” Field Artillery Journal, pg. 5-9, January-February 2003, which is incorporated herein by reference).

Existing web service standards are unable to address fully the issues facing modern enterprises during the integration of disparate information systems. Common semantic markup languages (e.g., OWL-S: Semantic Markup Language for Web Services, as described at http://www.w3.org/Submission/OWL-S, incorporated herein by reference) are not sufficiently broad to offer a solution, and existing semantic frameworks (e.g., Web Service Modeling Ontology (WSMO), as described at http://www.w3.org/Submission/WSMO, incorporated herein by reference) are not sufficiently general to address the integration needs imposed by emerging behavior. Further, existing sets of OWL/RDF mappings are likewise insufficiently broad to address the changing needs of the integration process (e.g., Crubezy, et al., “Mediating knowledge between application components”, Semantic Integration Workshop of the Second International Semantic Web Conference (ISWC-03), Sanibel Island, Fla., CEUR, 82.2003, incorporated herein by reference). It is left to the programmer at the time of integration to determine the semantic and contextual mismatches that exist between data representations on multiple information systems, resolve these mismatches, and generate the code that integrates the various information systems across the enterprise. Thus, a need exists for methods and systems to automatically integrate the disparate information systems into a service-oriented architecture that provides data interoperability across the enterprise.

The issues that modern enterprises face while integrating disparate information systems are often exacerbated by an ongoing transition of the internet from a collection of isolated information sources to an accessible network of highly interactive content and rich user interfaces. On the back end, this transition is fueled by the ability of data management systems to instantaneously change database models and to support the automated migration of data instances to updated data models. On the front end, open-source scripting languages are enjoying a renaissance on the web, and the open-source creation of content is a key driver of successful companies.

Further, large corporate and government enterprises, such as the Department of Defense, are increasingly challenged by the requirements of “netcentricity,” which mandates that stored information be accessible, understandable, and interoperable across an enterprise. Unlike the internet, where data exists in unstructured text documents, a large portion of the data within the Department of Defense is highly structured and is stored within databases. As such, search engines that depend upon searching text documents, such as Google, often fail within database-reliant enterprises. Thus, the capabilities to index and search metadata are virtually missing within the Department of Defense enterprise, and contractors are often required to work closely with communities-of-interest to provide data interoperability between structured databases on a case-by-case basis.

SUMMARY OF THE INVENTION

Accordingly, the present disclosure introduces methods and computer-program products for integrating data across multiple information systems.

According to various embodiments of the disclosed processes, a first set of mapping relations is developed to map entities in a source structured data model to corresponding entities in a target structured data model. A second set of mapping relations is also developed to map entities in the source structured data model and entities in the target structured data model to one or more shared contexts. The first and second sets of mapping relations are subsequently processed to generate a new structured data model. Executable code is generated to publish the new structured data model to implement workflow between the source information system and the destination information system. In an embodiment, the code executes source information to retrieve an instance, translates the source instance into an instance that conforms to the new structured data model, and subsequently invokes a destination service to further populate the instance conforming to the new structured data model. In an additional embodiment, services are invoked that correspond to the shared context to mediate mismatches between entities in the source structured data model and entities in the target structured data model.

Further features and advantages of the invention, as well as the structure and operation of various embodiments of the invention, are described in detail below with reference to the accompanying drawings. It is noted that the invention is not limited to the specific embodiments described herein. Such embodiments are presented herein for illustrative purposes only. Additional embodiments will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein.

BRIEF DESCRIPTION OF THE FIGURES

The present invention is described with reference to the accompanying drawings. The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the relevant art to make and use the invention.

FIG. 1 is a generic “triple” showing the relationship conveyed in graphs shown in this document.

FIG. 2 is a domain ontology for an address domain.

FIG. 3 is a partial ontology for an address database.

FIG. 4 is a generic upper ontology for a web service.

FIG. 5 is a portion of an augmented domain ontology D⁺.

FIG. 6 is a generic upper ontology for accessing a web service.

FIG. 7 is an instance of the ontology of FIG. 6 for a specific web service.

FIG. 8 is a portion of an augmented domain ontology D⁺⁺.

FIGS. 9, 10, 11a, and 11b show execution paths within D⁺⁺.

FIG. 12 is a flow chart for a standalone web service whose glue code may be generated by the Semantic Viewer.

FIG. 13 is a schematic diagram outlining a method to integrate multiple, disparate information systems across an enterprise.

FIG. 14 is a portion of an Air Mobility (AM) system ontology.

FIG. 15 is a portion of an Air Operations (AO) system ontology.

FIG. 16 is a portion of a Position context ontology.

FIG. 17 is a flow chart of a method to integrate web service and context ontologies.

FIG. 18 is a portion of a GeoTrans Web Services Description Language (WSDL) ontology.

FIG. 19 is a portion of an AO system ontology mapped onto a portion of the AM system ontology and the Position context ontology.

FIG. 20 is a portion of an AM system ontology onto which AM instance data has been mapped.

FIG. 21 is an example of an initial workflow obtained after applying the DPQ and IIQ graph traversal algorithms to the mapped ontologies.

FIG. 22 is an example of a workflow returned by the Semantic Viewer after processing an initial workflow with the Onto-Mapper to resolve mismatches.

FIG. 23 is an example of a preliminary workflow returned during the execution of the GeoTrans web service.

FIG. 24 is an example algorithm which creates a new data instance on the AO information system domain and links that data instance to corresponding AM data.

FIG. 25 is a schematic diagram outlining the creation of a new data instance on the AO information system ontology using the hasMatchingValue link.

FIG. 26 is a schematic diagram outlining the creation of a new data instance on the AO information system ontology using the hasMatch link.

FIG. 27 is an example of a generalized execution path containing multiple workflows.

FIG. 28 is a detailed flow diagram of an exemplary method for automatically deriving a new data model that facilitates the exchange of information between disparate information systems.

FIG. 29 is a detailed flow diagram of an exemplary method for automatically deriving information system ontologies and mappings that facilitate the integration of disparate information systems.

FIG. 30 is an exemplary SWSL script that automatically implements data models and ontological structures necessary to integrate disparate information systems.

FIG. 31 is a second exemplary SWSL script that automatically implements the integration of web services, and can generate the data models and ontological structures necessary to integrate disparate information systems.

FIGS. 32(a) through 32(e) are exemplary portions of an alignment block that may be incorporated within the exemplary SWSL script of FIG. 31.

FIG. 33 is a detailed flow diagram of an exemplary method for automatically generating ontological structures necessary to integrate disparate information systems within the exemplary SWSL script of FIG. 31.

FIG. 34 is a detailed flow diagram of an exemplary method for generating a set of relations between concept ontologies that may be incorporated into step 3308 of the exemplary method of FIG. 33.

FIG. 35 illustrates a structure of the exemplary scripts of FIGS. 30 and 31.

FIG. 36 is an exemplary computer architecture upon which the semantic system of the present invention may be implemented, according to an embodiment of the invention.

The present invention will now be described with reference to the accompanying drawings. In the drawings, generally, like reference numbers indicate identical or functionally similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

As used herein, a “web service” is an application program (for example, a program implemented in Java or PHP on the World Wide Web) that accepts defined input(s) and returns defined output(s), and that exposes its interface, for example using Web Services Description Language (WSDL). Example web services include the map servers found at MapQuest™ and Yahoo! Maps™, online telephone directories such as Switchboard.com™, and online weather services such as Weather.com™ and the National Weather Service site at http://www.nws.noaa.gov/. While the web services discussed in the specification are available on the World Wide Web, the term “web service” is also intended to include services accessible only on internal intranets (such as an employee “facebook” or directory) or on standalone computers (such as an XML front end to a database or application program).

As used herein, a “database” or “structured data repository” is a collection of data having a formalism (i.e., structured data). Databases may be organized into “tables” of “records” having “fields” as is conventional in the art (see, e.g., Webster's New World Dictionary of Computer Terms, 4th ed., Prentice Hall, New York, 1992), or may be more loosely structured (e.g., a structured text file or a tagged file format document).

As used herein, a “triple” or “RDF triple” is a statement that conveys information about a resource, and that can be represented as a subject, a predicate, and an object. As depicted herein, a triple is graphically represented as shown in FIG. 1, wherein subject and object nodes are connected by a directional line representing the predicate. An example statement that could be represented as a triple might be “the book Moby Dick (subject) has an author (predicate) whose value is Herman Melville (object).” Subject, predicate, and object can all be identified by uniform resource identifiers (URIs), such as a uniform resource locator (URL). Predicates are generally properties of resources, while subjects and objects may be described as “concepts.”
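For illustration only, the triple defined above can be sketched in Python as a three-field record. The class name Triple, the function describe, and the example.org URIs are assumptions introduced for this sketch; they do not appear in the specification.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject node linked to an object node by a predicate."""
    subject: str
    predicate: str
    object: str


# Hypothetical URIs, chosen only to illustrate that each part can be a URI.
moby_dick = Triple(
    subject="http://example.org/book/MobyDick",
    predicate="http://example.org/terms/hasAuthor",
    object="http://example.org/person/HermanMelville",
)


def describe(t: Triple) -> str:
    """Render the triple as a directed edge, mirroring the arrow of FIG. 1."""
    return f"{t.subject} --{t.predicate}--> {t.object}"
```

Calling describe(moby_dick) yields a single directed edge from the book to its author, labeled with the predicate.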

As used herein, an “ontology graph” or simply a “graph” is a collection of related triples that together convey information about the relationships between the concepts represented by the nodes of the graph (the set of objects and subjects of the statements represented by the triples).

As used herein, an “ontology” or “domain ontology” is a dictionary of terms within a given domain of knowledge, formulated in a known syntax and with commonly accepted definitions, including relationships between the terms (e.g., in the form of triples or an ontology graph).

Semantic Web Technologies

Today, Semantic Web technologies are beginning to emerge with promises of enabling a much faster integration of applications and data (see, e.g., Gruber, “A Translation Approach to Portable Ontologies,” J. Knowledge Acquisition, 5(2):199-200, 1993; Guarino, et al., “Ontologies and Knowledge Bases: Towards a Terminological Clarification,” in Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, N. Mars, ed., IOS Press, Amsterdam, 1995; Berners-Lee, et al., “The Semantic Web,” Scientific American, May 2001; and Daconta, et al., The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley US, July 2003, all of which are incorporated by reference herein). However, for that to happen, Semantic Web technologies must facilitate access to large amounts of data with minimal programmer intervention. Web services must be discovered, chained, and invoked automatically, thus relieving the programmer from having to do these steps. Semantic Web standards provide a rich framework for the precise description of data and applications, thereby enabling greater automation in this end-to-end web service execution process. The World Wide Web Consortium (W3C) proposed recommendation for a standard web ontology language (OWL), as described at http://www.w3.org/TR/2004/REC-owl-features-20040210/ (incorporated by reference herein), builds on web technologies including XML's ability to define customized tagging schemes and RDF's flexible approach to representing data.

Convergence between web services and Semantic Web technologies is beginning, as illustrated by the OWL-Services (OWL-S) effort (see http://www.daml.org/services/owl-s/1.0/ and pages linked therein, which are incorporated herein by reference). OWL-S is an effort to develop a web service ontology that could be used to describe the properties and capabilities of web services in unambiguous, computer-interpretable form.

According to the invention, such Semantic Web technologies can be used to discover execution paths spanning the retrieval of data from structured data repositories and execution of web services, to automate the integration of data repositories and web services, and to automatically generate the “glue code” needed to achieve the integration. The example described below shows how a marriage of web services and Semantic Web technologies can further automate this integration process. OWL-S represents a partial solution that can be used according to the invention to orchestrate web services. The example also illustrates how databases can be mapped into ontologies according to the invention, thus making their data available to web services.

An information system will need the capability to integrate new and existing services, e.g., application programs or web services, and incorporate additional information resources, e.g., relational databases or XML documents. In most cases, application programs and web services require access to information contained within a data source. Very often the challenge for IS systems is to integrate existing data sources with application programs or web services. The use of IS ontologies and an ontology management system according to the invention can enable this type of integration by generating the integration code automatically.

The existing OWL-S web service ontology model provides features for invoking and accessing web services. However, OWL-S does not address the need for web service integration and interaction with an existing database without first building a web service that abstracts the database. The invention provides a methodology to enable the integration of new web services with existing databases.

By representing databases and web services in ontology space and by linking them to the domain ontology, we can now integrate with existing databases, and other web services, without developing additional code. Using IS ontologies in this way not only results in code being generated, but also eliminates the need for creating a web service ontology with composite processes comprised of control flow constructs as defined in OWL-S.

The same Semantic Web technologies that integrate databases and web services into a single ontological structure may also be used to provide interoperability between the numerous information systems within modern enterprises. Typically, communities of interest (COI) within an enterprise develop information models with shared semantics and consistent representations of associated data and web services. In an effort to address emerging behavior, these large enterprises are moving towards a service-oriented architecture strategy that provides data and web service interoperability between the various communities of interest within the enterprise.

Building on the semantic framework described above, the invention uses Semantic Web technologies to integrate disparate information systems into a service-oriented architecture using OWL/RDF ontologies as well as OWL/RDF mapping relations. Individual information system domain ontologies still represent information system data models. However, our approach is no longer dependent on building domain ontologies in the Gruber sense. Ontological characterizations of the individual community-of-interest data and web service (WS) models are introduced to facilitate data interoperability across the enterprise. Further, the invention leverages ubiquitous enterprise concepts like Position, Time and Types of Things (What, When and Where) and builds a context ontology for each that relates all the various representations across the enterprise. The example described below illustrates how context ontologies can integrate the disparate information system ontologies across a large enterprise into a single structure providing data interoperability to each community of interest.

Although the concepts of Position, Time and Types of Things are ubiquitous across the enterprise, each concept may have a different representation within each community of interest or information system. By using OWL/RDF mappings to relate information system domain ontologies to context ontologies, we are able to resolve such representational differences. Additional OWL/RDF mappings between information system ontologies resolve structural and syntactic mismatches between individual concepts. Further OWL/RDF mappings associate web services with ontologies in a fashion identical to that used when integrating web services into ontological structures within a single information system. The invention provides a methodology to automate the interoperability of disparate information systems by reasoning over the set of information system ontologies, the context ontologies, and their associated OWL/RDF mappings. The reasoning process results in workflow discovery, automatic web service invocation, and reconciliation of mismatches.

The invention is described below with reference to particular examples such as integrating a database with a web service, and “chaining” multiple web services. However, those of ordinary skill in the art will readily see that the inventive techniques may be used in a variety of ways, including rapidly integrating pluralities of legacy databases and/or web services into new systems without a need for extensive programming, and selecting appropriate web services from within an ontology without a need for user familiarity with the available web services. Further, the invention may be used not only with an “ontology viewer” to process individual user queries, but may also be used to construct custom Information System (IS):web/database services that access underlying databases and web services as needed to process queries.

The invention is further described by additional examples, such as the integration of multiple, disparate information systems using a number of context ontologies and a translator web service. Those of ordinary skill in the art will readily see that the inventive techniques may be used in a variety of ways, including the rapid integration of both additional information systems and additional context ontologies into an existing semantic framework without a need for extensive programming. Further, the invention may be easily generalized to return a number of potential output workflows corresponding to a specified plurality of input data instances on a number of information systems.

EXAMPLE 1

Integrating Databases and/or Web Services into a Searchable Ontological Structure

The following example shows how an address database and a map-finding web service (e.g., the map services available at MapQuest™ and Yahoo!™) may be integrated according to the invention, by performing the following steps:

- Provide a domain ontology
- Create and link a database component ontology to the domain ontology
- Create and link a web service component ontology to the domain ontology
- Broker a user request to suggest executable paths
- Manually view the result of implementing one of the executable paths through the augmented ontology or automatically generate a web service to do so

The domain ontology, D, includes the RDF classes Business, Customer, Name, Location, Street, CityOrTown, State, and PostalZIP. D is represented as a uniform structure of triples, {subject, relationship, object}, as shown in Table 1. This ontology may also be represented graphically, as shown in FIG. 2. (Note that the domain ontology may include other classes and relationships; for simplicity, only seven triples from the ontology are shown.)

TABLE 1

D::{Business, sellsTo, Customer}
D::{Customer, doesBusinessAs, Name}
D::{Customer, residesAt, Location}
D::{Location, hasA, Street}
D::{Location, hasA, CityOrTown}
D::{Location, hasA, State}
D::{Location, hasA, PostalZIP}
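As a minimal sketch, the seven triples of Table 1 can be held in Python as (subject, relationship, object) tuples and queried as a graph. The helper names concepts and objects_of are assumptions introduced here for illustration and are not part of the described system.

```python
# Domain ontology D from Table 1, one (subject, relationship, object) tuple per triple.
D = [
    ("Business", "sellsTo", "Customer"),
    ("Customer", "doesBusinessAs", "Name"),
    ("Customer", "residesAt", "Location"),
    ("Location", "hasA", "Street"),
    ("Location", "hasA", "CityOrTown"),
    ("Location", "hasA", "State"),
    ("Location", "hasA", "PostalZIP"),
]


def concepts(triples):
    """Return the nodes of the ontology graph: every subject and every object."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


def objects_of(triples, subject, relationship):
    """Return the objects linked to `subject` by `relationship`, in table order."""
    return [o for s, r, o in triples if s == subject and r == relationship]
```

Here concepts(D) recovers the eight RDF classes listed in the text, and objects_of(D, "Location", "hasA") lists the four components of a Location.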

sellsTo, doesBusinessAs, and residesAt are OWL object properties, and hasA is an OWL datatype property with domain rdf:Class and range datatype string. As shown, this domain ontology is manually created, typically by experts in the business domain to which it pertains. It may be possible to automatically create useful domain ontologies using artificial intelligence techniques or other methods; such ontologies may also be used according to the invention.

A database ontology R is then constructed for linking to the domain ontology D to form an augmented ontology D⁺. R is the conjunction of a database upper ontology R^(U) specifying the structure, algebra, and constraints of the database, and a database lower ontology R^(L) including the data as instances of R^(U), as follows.

The upper ontology R^(U) specifies the structure, algebra and constraints of any relational database in RDF/OWL triples. We define the RDF classes Database, Relation, Attribute, PrimaryKey and ForeignKey. A portion of R^(U) is given in Table 2:

TABLE 2

R^(U)::{Database, hasRelation, Relation}
R^(U)::{Relation, hasAttribute, Attribute}
R^(U)::{PrimaryKey, subClassOf, Attribute}
R^(U)::{ForeignKey, subClassOf, Attribute}

where hasRelation and hasAttribute are OWL object properties, and subClassOf is defined in the RDF schema (available at http://www.w3.org/TR/rdf-schema/).

Consider a database having a table ADDRESS as depicted in Table 3. (For brevity, only two of the data records are shown.) The relation ADDRESS has Address_ID as the primary key, and Name, Street, City, State, and Zip as attributes. The portion of R^(L) corresponding to this table may then be constructed (in part) as shown in Table 4.

TABLE 3

Address_ID  Name                   Street               City        State  Zip
001         The MITRE Corporation  202 Burlington Road  Bedford     MA     01730
002         XYC, Inc.              255 North Road       Chelmsford  MA     01824

TABLE 4

R^(L)::{Address, isInstanceOf, Relation}
R^(L)::{Address, hasAttribute, Address_ID}
R^(L)::{Address, hasAttribute, Street}
R^(L)::{Address, hasAttribute, Name}
R^(L)::{Address, hasAttribute, Zip}
R^(L)::{Name, isInstanceOf, Attribute}
R^(L)::{Street, isInstanceOf, Attribute}
R^(L)::{Zip, isInstanceOf, Attribute}
R^(L)::{Address_ID, isInstanceOf, PrimaryKey}
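The pattern of Table 4 is regular enough to be produced mechanically from a relation's schema. The following sketch assumes a hypothetical helper, lower_ontology, that is not named in the specification; it emits triples for every column, whereas Table 4 shows only part of the set.

```python
def lower_ontology(relation, primary_key, attributes):
    """Emit R^(L)-style triples for one relation, following the pattern of Table 4.

    `relation`, `primary_key`, and `attributes` are plain strings taken from the
    database schema; this function is an illustrative assumption only.
    """
    triples = [
        (relation, "isInstanceOf", "Relation"),
        (relation, "hasAttribute", primary_key),
        (primary_key, "isInstanceOf", "PrimaryKey"),
    ]
    for attribute in attributes:
        triples.append((relation, "hasAttribute", attribute))
        triples.append((attribute, "isInstanceOf", "Attribute"))
    return triples


# The ADDRESS relation of Table 3, including the City and State columns.
R_L = lower_ontology(
    "Address", "Address_ID", ["Name", "Street", "City", "State", "Zip"]
)
```

Because PrimaryKey is a subclass of Attribute in R^(U), the primary key is typed only as PrimaryKey here, as in Table 4.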

R is then the conjunction of R^(U) and R^(L), as partially shown in FIG. 3. (Note that the fields “City” and “State” have not been shown in R for the sake of brevity, but are linked to the concepts “CityOrTown” and “State” in D in a manner analogous to that shown for “Street” and “Zip” below.) If there are entity relationships not captured in R, these may be inserted manually at this stage.

The concepts in database ontology R are then mapped to the concepts in the domain ontology D to create an augmented domain ontology D⁺. In this example, this is done using the relationship hasSource, as shown in Table 5. (Note that the linked concepts need not have identical names, as in the mapping between D::PostalZIP and R::Zip.)

TABLE 5

D⁺::{D::Name, hasSource, R::Name}
D⁺::{D::Street, hasSource, R::Street}
D⁺::{D::PostalZIP, hasSource, R::Zip}
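One way to picture the augmented ontology D⁺ is as the union of the domain triples, the database triples, and the hasSource mappings of Table 5. The sketch below uses small excerpts of D and R and a hypothetical lookup function, source_of; treating D⁺ as a plain set union is a simplifying assumption made for this illustration.

```python
# Excerpts of the domain ontology D and database ontology R, with prefixed names.
D = {
    ("D::Customer", "doesBusinessAs", "D::Name"),
    ("D::Location", "hasA", "D::Street"),
    ("D::Location", "hasA", "D::PostalZIP"),
}
R = {
    ("R::Address", "hasAttribute", "R::Name"),
    ("R::Address", "hasAttribute", "R::Street"),
    ("R::Address", "hasAttribute", "R::Zip"),
}
# The hasSource mappings of Table 5.
MAPPINGS = {
    ("D::Name", "hasSource", "R::Name"),
    ("D::Street", "hasSource", "R::Street"),
    ("D::PostalZIP", "hasSource", "R::Zip"),
}

# Assumption: D+ is modeled as the union of the three triple sets.
D_PLUS = D | R | MAPPINGS


def source_of(concept, ontology):
    """Return the database concept that supplies `concept`, or None if unmapped."""
    for subject, relationship, obj in ontology:
        if subject == concept and relationship == "hasSource":
            return obj
    return None
```

The lookup shows that linked concepts need not share a name: D::PostalZIP resolves to R::Zip.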

The map-finding web service is then mapped to an upper ontology W^(U) that models web services as concepts of Inputs (inParameters), Output, Classification Conditions, and Effects, as shown in FIG. 4. An instance of this ontology, W^(I), is created for the map-generation web service of the example. This ontology is then mapped to D⁺ to form augmented ontology D⁺⁺, for example using the relationship isInputTo, e.g. D⁺⁺::{D⁺::Location, isInputTo, W^(I)::MapQuest}. A portion of D⁺⁺, showing a link between W^(I) and D⁺, is shown in FIG. 5.

For automatic code generation, in addition to the ontological representation of the inputs and outputs of the web service, access parameters for the web service may be needed. A generic upper ontology, V^(U), that models how to access and invoke a web service is shown in FIG. 6. V^(U) also preserves the structure of any document consumed or produced by the web service. An instance of this ontology, W^(M), is created to describe the parameters for invoking the web service. As shown in FIG. 7, W^(M) shows parameters for invoking MapQuest™. (For clarity in the drawing, the URL for MapQuest™, http://mapquest.com/maps/map.adp, is represented as [URL Val].) W^(M) is then mapped into D⁺⁺, for example using the relationships isValueOf, isServedAt, and hasResult. The isValueOf relationship links the range or object value of the hasLabel relationship in the ontology W^(M) to concepts in the augmented domain ontology D⁺⁺. The isServedAt relationship links the subject or domain of the hasOutput relationships in the D⁺⁺ ontology to the object of the hasUrl relationship in W^(M). The hasResult relationship links the range of the hasLabel relationship in W^(M) to the range of the hasOutput relationship in the ontology D⁺⁺. This relationship is useful when the output of the web service contains the inputs of another, as further discussed below. A portion of D⁺⁺ including W^(M) and W^(I) is shown in FIG. 8, illustrating the links between the web service ontologies and the domain ontology.
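To suggest how access parameters of this kind could drive generated glue code, the sketch below flattens W^(M) into a dictionary and assembles a request URL without contacting any service. The parameter labels "address" and "zipcode", the dictionary layout, and the function build_request are hypothetical; the specification gives only the service URL and the relationship names.

```python
from urllib.parse import urlencode

# Simplified stand-in for W^(M): the service URL (hasUrl) and, for each domain
# concept, the request label it is the value of (hasLabel / isValueOf).
# The labels "address" and "zipcode" are assumptions, not taken from the source.
W_M = {
    "hasUrl": "http://mapquest.com/maps/map.adp",
    "hasLabel": {"D::Street": "address", "D::PostalZIP": "zipcode"},
}


def build_request(access, values):
    """Map concept values onto the service's labels and return the request URL."""
    params = {
        access["hasLabel"][concept]: value
        for concept, value in values.items()
        if concept in access["hasLabel"]
    }
    return access["hasUrl"] + "?" + urlencode(params)


url = build_request(W_M, {"D::Street": "255 North Road", "D::PostalZIP": "01824"})
```

Values for concepts that the service does not accept are simply dropped, so the caller may pass every value it has located in the ontology.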

Once the ontology D⁺⁺ has been created, a single application (hereinafter, the “Semantic Viewer”) can broker a variety of user requests to return data outputs and/or executable glue code. As implemented in this example, the Semantic Viewer is a web-enabled application. When a user enters any input data, it is linked to concepts in the augmented domain ontology using the userInput, userOutput, and userCriteria relationships. The userOutput relationship links a goal the user is interested in to a concept in the augmented domain ontology. The userInput relationship links the user's input to the object values of inParameter relationships that are not inputs to web services in the augmented domain ontology. The userCriteria relationship is used to link the user's input to concepts in the augmented domain ontology.

For example, suppose a user provides “255 North Road” and “01824” as inputs, and “map” as a requested output. The Semantic Viewer searches D⁺⁺ for matches to the input values, and locates them in R^(L) as a Street and a Zipcode, respectively. In addition, it searches for “map” and locates it as a concept in W^(I). It then locates a path through the ontology graph linking the inputs with the output, as shown in FIG. 9. Finally, it generates executable glue code to actually invoke the web service discovered (in this case, MapQuest™) and return a map of the requested address.

The above example user inputs are the same as what would be required if the user were simply to visit MapQuest™ and request the map directly, although the user does not have to know how to access MapQuest™ in order to use the Semantic Viewer as described in this example. If multiple map services were available, the Semantic Viewer would present the user with multiple execution paths, allowing access to whichever map service was desired, again without requiring the user to know the URL or data formatting requirements of each service.

The additional power of the Semantic Viewer can be seen if the user instead enters “MITRE” as an input, and “map” as a requested output. No available web service in D⁺⁺ takes a corporate name and returns a map. However, the Semantic Viewer still locates “MITRE” in the database as an instance of Business_Name, and discovers a path through the ontology graph linking it to the Map output of MapQuest™, as shown in FIG. 10. Thus, the execution path returned now includes a database query to discover the street address of the MITRE Corporation, formats that data for the MapQuest™ map service, and returns a map of the company location.

In practice, the Semantic Viewer may find multiple execution paths for a single query. In this case, the user may be presented with the multiple execution paths and allowed to select the desired path. For example, in the case where a user enters a single input of “01730” and a desired output of “map,” there are two possible connecting paths through the D⁺⁺ described above. According to one path, illustrated in FIG. 11a, the Semantic Viewer recognizes “01730” as a zip code, and passes it to MapQuest™ without a street address, resulting in a map of the general area around that zip code (the recognition of “01730” as a zip code according to this path may be through its existence in the database, but it is also within the scope of the invention to manually indicate that 01730 is a zip code in order for the Semantic Viewer to discover an execution path). However, there also exists a path, illustrated in FIG. 11b, in which the Semantic Viewer finds each instance of “01730” in the database, and passes each Street Address/Zipcode combination (for each listed business having that zip code) to MapQuest™, obtaining maps for every business in the selected zip code area.

In the above examples, a single output of a single web service has been the desired output. However, multiple outputs may also be requested, and these outputs may not all be derived from the same web service. For example, suppose a user enters a single input of “01730” and desired outputs of “map” and “business name.” In this case, an execution path similar to the second path described in the previous paragraph exists. The Semantic Viewer recognizes 01730 as a zip code, and queries the database to find all of the businesses in that zip code, requesting both the business name and the street address. The street addresses and the selected zip code are passed to MapQuest™, and the resulting maps are returned along with the business names, for each business in the database that matches the criteria.

The Semantic Viewer may also “chain” web services as necessary to obtain a desired result. For example, suppose the domain ontology also contains the relationships hasVoiceNumber and hasFaxNumber, and has been further augmented to include a reverse lookup telephone service (such as that found at http://www.switchboard.com), which accepts an input “telephone number” and provides a listing name and address. In this case, when a user enters “781-271-2000” as input and “map” as output, one of the returned execution paths will include taking the telephone number from the database listing, passing it to the reverse lookup service to obtain a street address, and passing the street address to MapQuest™ to obtain the requested map.

Similarly, web services may operate in parallel on the same or related inputs to provide multiple outputs. For example, web services that provide maps (as discussed above) and aerial photos (such as that found at http://www.terraserver-usa.com/) may both be called with the same address information, if an address is input and both “map” and “photo” are selected as outputs.

For each of the above-described examples, the Semantic Viewer has located an execution path, and then performed the necessary steps to answer a user query. However, the execution path may also be used to generate glue code necessary to create a new service of the type requested. For example, in the case described above in which a user provided the input “MITRE” and the output “map,” in addition to simply providing a map of the MITRE location, the Semantic Viewer can also return executable “glue code” for a service that accepts a company name, looks it up in the database to find a street address, passes the street address to MapQuest™, and returns the requested map. This static code can then be directly used to create a new standalone web service which is independent of the Semantic Viewer. A flow chart of the resulting web service, including exemplary pseudocode describing the construction of a SQL database query and a web service URL, is shown in FIG. 12. Of course, the code generated will depend on the specific requirements of the database and web service. Further, the database type and query syntax may be represented in ontological form and linked to the R in an analogous way to the construction of W_(M) for accessing the web service, as discussed above.
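The kind of glue code described above can be sketched as follows. This is a minimal illustration only, not the code of FIG. 12: the table name `business`, its column names, the query parameter names `address` and `zip`, and the sample row are all assumptions made for the example, and an in-memory SQLite database stands in for the enterprise database.

```python
import sqlite3
from urllib.parse import urlencode


def lookup_address(conn, business_name):
    """Return (street, zipcode) for a business, or None if it is not listed."""
    # Hypothetical schema: the source does not give table or column names.
    return conn.execute(
        "SELECT street, zipcode FROM business WHERE business_name = ?",
        (business_name,),
    ).fetchone()


def build_map_url(base_url, street, zipcode):
    """Format the address as query parameters for the map service."""
    # Parameter names "address" and "zip" are assumed for illustration.
    return base_url + "?" + urlencode({"address": street, "zip": zipcode})


def map_for_business(conn, business_name, base_url):
    """Glue code: a database lookup followed by map URL construction."""
    found = lookup_address(conn, business_name)
    if found is None:
        return None
    street, zipcode = found
    return build_map_url(base_url, street, zipcode)


# Sample data for the sketch only (address values reused from the earlier example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business (business_name TEXT, street TEXT, zipcode TEXT)")
conn.execute("INSERT INTO business VALUES ('ExampleCo', '255 North Road', '01824')")
url = map_for_business(conn, "ExampleCo", "http://mapquest.com/maps/map.adp")
```

Because the function takes only a company name and a base URL, it can run without the Semantic Viewer once generated, which is the point of emitting static code.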

EXAMPLE 2 Integrating Disparate Information System Ontologies into a Searchable Ontological Structure

The following example illustrates how two disparate information systems within a given enterprise may be integrated into a service-oriented architecture according to the present invention. The approach builds upon the semantic framework outlined in Example 1 for the integration of databases and web services within a given information system, and the following example extends this framework to facilitate the integration of multiple information systems within an enterprise. An overview of the approach is shown in FIG. 13, and it encompasses the following steps:

- Providing a domain ontology for each information system
- Providing a context ontology to capture commonly-held concepts and their representations on each information system
- Creating and linking a translator web service ontology to the domain ontology
- Mapping the individual information system ontologies to the context ontology
- Mapping the concepts across individual information system ontologies
- Brokering a request to suggest workflows between a data instance on a source information system and a corresponding concept on a target information system
- Automatically executing one or more of these generated workflows to create the corresponding data instance on a target information system.

The example below discusses the integration of two distinct air-flight scheduling systems, Air Mobility (AM) and Air Operations (AO), within the Department of Defense (DoD) enterprise. Each flight scheduling system represents an information system, and the concepts that characterize each information system are included within its respective domain ontology. The AO and AM domain ontologies are built using OWL/RDF relations and are represented as a uniform structure of triples in a fashion similar to that outlined in Example 1. FIG. 14 and FIG. 15 graphically represent selected portions of the AM and AO domain ontologies, respectively.

Although both the AM and AO ontologies describe aircraft and related events, FIG. 14 and FIG. 15 indicate that significant distinctions exist in the structure, representation, and terminology of these concepts across the AM and AO systems. For example, the AM ontology represents position using Universal Transverse Mercator (UTM) coordinates, while the AO ontology represents position with Geodetic coordinates. A similar discrepancy is noted between the terminology used by the AM and AO ontologies to denote equivalent Aircraft Types. Further, the AM and AO ontologies use different overall structures to represent particular flight concepts. For example, the AO ontology provides both starting and stopping times for events, while the AM ontology uses only a single event time.

To provide interoperability between information systems in the enterprise, or in this example between the AM and AO systems, a number of “context ontologies” must be created to account for the representational, structural, and terminological mismatches between these respective systems. These “context ontologies” capture concepts commonly held across the enterprise and account for the representation of a particular concept on each information system. The following example addresses three such context ontologies: Position, Time, and Types of Things, i.e., Where, When and What.

The Position context ontology contains comprehensive specifications of the different representations of a Geo-Coordinate point within the AM and AO domain ontologies (e.g., the genus of Coordinate Systems, Datums, Coordinate Reference Frames and formats). In the following example, the Position context ontology is based on the set of coordinate systems used by National Geospatial Agency (National Imagery and Mapping Agency (NIMA) USA, GEOTRANS 2.2.4-Geographic Translator, as described at http://earth-info.nima.mil/GandG/geotrans, incorporated herein by reference).

A small portion of the overall Position context ontology is presented in the drawings. Using this ontology, any Geo-Coordinate position may be disambiguated by specifying its Coordinate System, its Coordinate Reference Frame, and its Datum using OWL classes and object properties. These particular classes, and the relationships between them, are defined in the context ontology, and a partial listing of these definitions is given below (for a more complete listing of classes and relationships, see Sabbouh, et al., “Using Semantic Web Technologies to Enable Interoperability of Disparate Information Systems,” as described at http://www.mitre.org/work/tech_papers/tech_papers_05/05_1025/05_1025.pdf, incorporated herein by reference in its entirety):

<owl:Class rdf:ID="COORDINATE"/>
<owl:Class rdf:ID="DATUM"/>
<owl:Class rdf:ID="COORD-REF-FRAME"/>
<owl:ObjectProperty rdf:ID="Has-Datum">
    <rdfs:domain rdf:resource="#COORDINATE"/>
    <rdfs:range rdf:resource="#DATUM"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="Has-Coord-Ref-Frame">
    <rdfs:domain rdf:resource="#COORDINATE"/>
    <rdfs:range rdf:resource="#COORD-REF-FRAME"/>
</owl:ObjectProperty>
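As an illustration of how such definitions might be consumed, the following sketch reads the listing with Python's standard XML parser and recovers the domain and range of each object property. The enclosing rdf:RDF element with its namespace declarations is added here as an assumption so that the fragment parses on its own, the attribute is written rdf:ID as RDF/XML defines it, and the function name `object_properties` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The partial listing, wrapped in an rdf:RDF root (an assumption of this sketch).
OWL_DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="COORDINATE"/>
  <owl:Class rdf:ID="DATUM"/>
  <owl:Class rdf:ID="COORD-REF-FRAME"/>
  <owl:ObjectProperty rdf:ID="Has-Datum">
    <rdfs:domain rdf:resource="#COORDINATE"/>
    <rdfs:range rdf:resource="#DATUM"/>
  </owl:ObjectProperty>
  <owl:ObjectProperty rdf:ID="Has-Coord-Ref-Frame">
    <rdfs:domain rdf:resource="#COORDINATE"/>
    <rdfs:range rdf:resource="#COORD-REF-FRAME"/>
  </owl:ObjectProperty>
</rdf:RDF>
"""


def object_properties(xml_text):
    """Map each owl:ObjectProperty name to its (domain, range) class names."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall(f"{{{OWL}}}ObjectProperty"):
        name = prop.get(f"{{{RDF}}}ID")
        domain = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
        range_ = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
        # Resources are local references such as "#DATUM"; drop the "#".
        props[name] = (domain.lstrip("#"), range_.lstrip("#"))
    return props


properties = object_properties(OWL_DOC)
```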

Context ontologies are specified in a similar fashion for the ubiquitous Time and Types of Things concepts. A number of Time ontologies are publicly available (e.g., Hobbs, J., “A DAML ontology of time”, 2002, as described at http://www.cs.rochester.edu/~ferguson/daml/daml-time-nov2002.txt, incorporated herein by reference), and these ontologies are generally based on the numerous representations of date and time used by the military (e.g., mm/dd/yyyy, Zulu, EST, etc.). The OWL classes and relationships used within the Time context ontology thus closely mimic those used in the Position context ontology.

Two different variants of the Types of Things context ontology have been discerned for the AM and AO systems: Aircraft-Types, and Event-Types. These context ontologies contain the different representations of aircraft and events that are used by both systems, and as a result, two subtypes for Aircraft-Types and Event-Types have been developed. Table 6 provides a partial listing of these subtypes.

TABLE 6
AM-AIRCRAFT-TYPES | subClassOf | AIRCRAFT-TYPES
AO-AIRCRAFT-TYPES | subClassOf | AIRCRAFT-TYPES
F-16 | instanceOf (OWL Individual) | AM-AIRCRAFT-TYPES
F-16E | instanceOf (OWL Individual) | AO-AIRCRAFT-TYPES
AM-EVENT-TYPES | subClassOf | EVENT-TYPES
AO-EVENT-TYPES | subClassOf | EVENT-TYPES
DEP | instanceOf (OWL Individual) | AM-EVENT-TYPES
TAKEOFF | instanceOf (OWL Individual) | AO-EVENT-TYPES

A “translator web service” must be associated with each context ontology to translate between the distinct data representations on each information system ontology. The following example utilizes the GeoTrans web service, a translator web service based on the Geographic Translator (see National Imagery and Mapping Agency (NIMA) USA, GEOTRANS 2.2.4-Geographic Translator, http://earth-info.nima.mil/GandG/geotrans/, incorporated herein by reference in its entirety). Although concentrating on a particular web service and the Position context ontology, the approach outlined in the example is easily extended to include other appropriate web services and additional context ontologies.

The integration of the GeoTrans web service into the Position context ontology is outlined in FIG. 17, and the approach follows the process outlined in Example 1 for the integration of a generic web service into a domain ontology. Prior to integration, the GeoTrans WSDL ontology must be re-created from the GeoTrans WSDL file stored in the ontology management system. The GeoTrans web service must also be mapped to an upper ontology to create a GeoTrans web service upper ontology. The GeoTrans upper ontology and the GeoTrans WSDL ontology are then linked to the Position context ontology using OWL/RDF mappings to form an augmented ontology.

FIG. 18 graphically presents a portion of the GeoTrans upper ontology, the WSDL ontology, and the mappings that connect each to the Position context ontology. The OWL/RDF mappings in FIG. 18, as well as their domains and ranges, are summarized in Table 7. While the example employs these mappings to connect a web service to a context ontology, these same mappings could be employed to connect a generic web service to an IS domain ontology.

The relevant concepts in the AM ontology must then be mapped to the corresponding concepts in the AO ontology to enable the exchange of data instances between the AM and AO information systems. This required mapping occurs in two steps. The individual concepts in the AM ontology are first matched with corresponding concepts in the AO ontology to generate semantic matches between concepts. Concepts within the AM and AO ontologies are then independently mapped to the context ontology to resolve representational mismatches. Note that concept matching requires agreement between the AO and AM users, whereas mapping to context ontologies is done independently for each system. Table 8 lists a number of the OWL/RDF mappings that link concepts in the AM and AO domain ontologies to the context ontology, and FIG. 19 presents a portion of the AO ontology fully mapped to the AM and Position context ontologies.

TABLE 7
Domain | OWL/RDF Mappings | Range
rdfs:Class (in Context Ontology) | isInputOf | Webservice:Class (in WS upper ontology)
Webservice:Class (in WS upper ontology) | hasInput (inverseOf isInputOf) | rdfs:Class (in Context Ontology)
rdfs:Class (in Context Ontology) | isOutputOf | Webservice:Class (in WS upper ontology)
Webservice:Class (in WS upper ontology) | hasOutput | rdfs:Class (in Context Ontology)
Webservice:Class (in WS upper ontology) | inParameter | rdfs:Class (in Context Ontology)
Webservice:Class (in WS upper ontology) | outParameter | rdfs:Class (in Context Ontology)
Webservice:Class (in WS upper ontology) | hasEffect | rdfs:Class (in Context Ontology)
Webservice:Class (in WS upper ontology) | hasClassificationCondition | rdfs:Class (in Context Ontology)
rdfs:Class (in Context Ontology) | isValueOf | rdfs:Class (in WS WSDL Ontology)
rdfs:Class (in Context Ontology) | isOutputValueOf | rdfs:Class (in WS WSDL Ontology)
rdfs:Class (in Context Ontology) | hasResult | rdfs:Class (in WS WSDL Ontology)
rdfs:Class (in Ontology) | isCorrelatedWith | rdfs:Class (in Ontology)

TABLE 8
Domain | OWL Object Property | Range | When to Use
rdfs:Class (in IS Ontology) | hasContext | rdfs:Class (in Context Ontology) | Representational change
rdfs:Class (in Context Ontology) | isTheContextOf (inverse of hasContext) | rdfs:Class (in IS Ontology) | Representational change
rdfs:Class (in IS Ontology) | hasMatch (symmetric) | rdfs:Class (in IS Ontology) | Representational change
rdfs:Class (in IS Ontology) | hasMatchingValue (symmetric) | rdfs:Class (in IS Ontology) | No representational change

The OWL/RDF mappings in Table 8 are asserted to match equivalent concepts in the AM and AO domain ontologies and to resolve the resulting mismatches. The following mapping could be asserted when instance values can be copied from the AM system directly to the AO system without transformation:

- AIR-FIELD-NAME hasMatchingValue AM-AIRPORT-NAME.

Representational mismatch between coordinates in the AO and AM domain ontologies could be resolved through the assertion of the following sequence of mappings:

- AO-COORD hasContext LATLONHTCOORDINATE_WGE
- AM-COORD hasContext UTMCOORDINATE_WGE
- AO-COORD hasMatch AM-COORD

To reconcile terminology mismatches between various event types or various aircraft types in the AM and AO systems, the following mappings could be asserted:

- AM-AIRCRAFT-TYPES hasMatch AO-AIRCRAFT-TYPES
- AM-EVENT-TYPES hasMatch AO-EVENT-TYPES
- F-16E OWL:sameAs F-16
- DEP OWL:sameAs TAKEOFF
- ARR OWL:sameAs LANDING

The approach outlined by the present example, although describing only a small number of potential mappings, can be easily extended to include any number of appropriate OWL/RDF mapping relations.
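One way to read these assertions is as a small set of triples that tells a program whether a value can be copied or must be translated. The sketch below is an illustrative interpretation under that assumption; the function names `find` and `plan_transfer` and the returned tuples are hypothetical, not part of the described system.

```python
# Mapping assertions from the text, held as (subject, predicate, object) triples.
MAPPINGS = [
    ("AIR-FIELD-NAME", "hasMatchingValue", "AM-AIRPORT-NAME"),
    ("AO-COORD", "hasContext", "LATLONHTCOORDINATE_WGE"),
    ("AM-COORD", "hasContext", "UTMCOORDINATE_WGE"),
    ("AO-COORD", "hasMatch", "AM-COORD"),
]

# Table 8 marks these two properties as symmetric.
SYMMETRIC = {"hasMatch", "hasMatchingValue"}


def find(subject, predicate):
    """Return every concept linked to `subject` by `predicate`."""
    linked = []
    for s, p, o in MAPPINGS:
        if p != predicate:
            continue
        if s == subject:
            linked.append(o)
        elif p in SYMMETRIC and o == subject:
            linked.append(s)  # symmetric link read in the other direction
    return linked


def plan_transfer(am_concept):
    """Describe how a value of an AM concept would reach the AO system."""
    for target in find(am_concept, "hasMatchingValue"):
        return ("copy", target)  # no representational change
    for target in find(am_concept, "hasMatch"):
        source_context = find(am_concept, "hasContext")
        target_context = find(target, "hasContext")
        if source_context and target_context:
            return ("translate", target, source_context[0], target_context[0])
    return None


copy_plan = plan_transfer("AM-AIRPORT-NAME")
translate_plan = plan_transfer("AM-COORD")
```

In this reading, a "translate" plan names the two context classes between which a translator web service would have to convert.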

Having accomplished the mapping of web service ontologies and IS ontologies to context ontologies, the AM database is then mapped onto the AM domain ontology. The mapping process closely follows the approach outlined in Example 1, and once linked, the AM data are treated as instances of the AM ontology. FIG. 20 presents a portion of the resulting augmented AM ontology.

To create AM data on the AO system, the AM data instances must be translated to match the structure, terminology, and representations of concepts on the AO system. To accomplish this end, a mapping interpreter, Onto-Mapper, is built to process the OWL/RDF mappings presented in Table 8 and to create new AO instance data from corresponding data on the AM system. In the semantic framework of the invention, the Onto-Mapper is built as a specialized web service that acts as a Service Agent to be invoked only upon data exchange. Rather than have arbitrary inputs and outputs, this specialized web service interprets the RDF/OWL links to create instance data in a target system from data that originated in a source system. Table 9 outlines the formal definition of the Onto-Mapper.

TABLE 9
Service-Agent | subClassOf | Web-Service
Onto-Mapper | subClassOf | Service-Agent
hasAgentInput | subProperty | hasInput
hasAgentOutput | subProperty | hasOutput
isAgentInputOf | subProperty | isInputOf
isAgentOutputOf | subProperty | isOutputOf
isAgentInputOf | InverseOf | hasAgentInput
isAgentOutputOf | InverseOf | hasAgentOutput

Then, to translate an AM data instance into an AO data instance, the Semantic Viewer reasons with the mapped ontologies to discover workflows, to invoke/execute/process web services, and to create the corresponding AO instance. The reasoning process employs a combination of graph-traversal algorithms and invocations of the Onto-Mapper. The invention makes extensive use of two graph traversal algorithms: Direct Path Query (DPQ), and Incoming Intersection Query (IIQ).

Given a list of input values and a desired output, the DPQ creates the set of all the direct paths that lead to the desired output concept from the input concepts. The DPQ algorithm may be defined more formally as follows:

- For input list i_(n), output v
- Find the direct paths P_(k) {p₁, p₂, . . . } ending with v and starting with each i in i_(n), where a direct path is the sequence of nodes (i.e., concepts in the ontology graph) and the relations or links that connect them. The system can be configured to exclude nodes connected by specific links or to return only paths containing certain links.
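The DPQ definition above can be sketched as a depth-first enumeration of paths over a graph of triples. This is one possible reading, assuming links are followed from subject to object and that a path never revisits a node; the sample triples and the `instanceOf` link name are hypothetical.

```python
def direct_path_query(triples, inputs, output, excluded=frozenset()):
    """Return all direct paths from each input to the output.

    A path is a list alternating nodes and links:
    [node, link, node, link, ..., output].
    Links named in `excluded` are ignored during traversal.
    """
    adjacency = {}
    for s, p, o in triples:
        if p not in excluded:
            adjacency.setdefault(s, []).append((p, o))

    paths = []

    def walk(node, path, seen):
        if node == output:
            paths.append(path)
            return
        for link, nxt in adjacency.get(node, []):
            if nxt not in seen:  # avoid cycles
                walk(nxt, path + [link, nxt], seen | {nxt})

    for start in inputs:
        walk(start, [start], {start})
    return paths


# Hypothetical fragment of an augmented ontology for the map example.
TRIPLES = [
    ("255 North Road", "instanceOf", "Street"),
    ("01824", "instanceOf", "Zipcode"),
    ("Street", "isInputOf", "MapQuest"),
    ("Zipcode", "isInputOf", "MapQuest"),
    ("MapQuest", "hasOutput", "Map"),
]
paths = direct_path_query(TRIPLES, ["255 North Road", "01824"], "Map")
```

Passing `excluded={"isInputOf"}` to the same call returns no paths, which mirrors the configurable link exclusion mentioned in the definition.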

The IIQ algorithm first creates the set of all the direct paths that lead to the desired output concept using a DPQ. Second, for each input value, the algorithm creates the set of direct paths that lead to the given input. Third, the algorithm calculates and returns the intersection of these sets. This IIQ algorithm may be defined more formally as follows:

- For input list i_(n), output v
- Find the list of nodes x_(i) {x₁, x₂, . . . } that has direct paths P_(k) {p₁, p₂, . . . } with v
- For each i in i_(n), find the list of nodes y_(j) {y₁, y₂, . . . } that has direct paths Q_(m) {q₁, q₂, . . . } with i
- Return {x_(i), P_(k), Q_(m)} where x_(i) = y_(j)
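A sketch of the IIQ under one interpretation of the definition above: a node qualifies when it has direct paths both to the output v and to an input i. The helper `paths_between`, the sample triples, and the `instanceOf` link name are assumptions of this example.

```python
def paths_between(triples, start, goal):
    """Enumerate cycle-free paths from start to goal, following subject -> object."""
    adjacency = {}
    for s, p, o in triples:
        adjacency.setdefault(s, []).append((p, o))

    found = []

    def walk(node, path, seen):
        if node == goal:
            found.append(path)
            return
        for link, nxt in adjacency.get(node, []):
            if nxt not in seen:
                walk(nxt, path + [link, nxt], seen | {nxt})

    walk(start, [start], {start})
    return found


def incoming_intersection_query(triples, inputs, output):
    """Return (node, paths_to_output, paths_to_input) for each intersecting node."""
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    results = []
    for i in inputs:
        for node in nodes:
            if node == output or node == i:
                continue  # the endpoints themselves are not intersections
            to_output = paths_between(triples, node, output)
            to_input = paths_between(triples, node, i)
            if to_output and to_input:
                results.append((node, to_output, to_input))
    return results


# Hypothetical triples: the Onto-Mapper points at both the input and the output class.
TRIPLES = [
    ("Onto-Mapper", "hasAgentInput", "AM-FLIGHT"),
    ("Onto-Mapper", "hasAgentOutput", "AO-FLIGHT"),
    ("AM-FLIGHT-AS1040300041", "instanceOf", "AM-FLIGHT"),
]
hits = incoming_intersection_query(TRIPLES, ["AM-FLIGHT"], "AO-FLIGHT")
```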

The following example illustrates the translation of a particular data instance on the AM system into a corresponding data instance on the AO system. A user specifies the AM-FLIGHT instance AM-FLIGHT-AS1040300041, shown in FIG. 20, as an input to the Semantic Viewer and an instance of AO-FLIGHT as an output. To discover the initial workflow, the Semantic Viewer first runs a DPQ and, if no workflow is found, an IIQ. The graph traversal algorithms exclude the inParameter, outParameter, hasMatchingValue, hasMatch, hasContext, and isTheContextOf relationships from the initial search. The returned paths are interpreted as a workflow by the Semantic Viewer if they contain the relationship isInputOf or any of its sub-properties. An example of an initial execution path returned by these queries is shown in FIG. 21.
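The workflow test described above (a path counts as a workflow if it contains isInputOf or one of its sub-properties) can be sketched with the sub-property relations of Table 9. The path representation, alternating nodes and links, and the `instanceOf` link in the sample path are assumptions of this example.

```python
# Sub-property relations taken from Table 9 (child -> parent).
SUB_PROPERTY_OF = {
    "hasAgentInput": "hasInput",
    "hasAgentOutput": "hasOutput",
    "isAgentInputOf": "isInputOf",
    "isAgentOutputOf": "isOutputOf",
}


def is_sub_property(prop, ancestor):
    """True if `prop` equals `ancestor` or descends from it."""
    while prop is not None:
        if prop == ancestor:
            return True
        prop = SUB_PROPERTY_OF.get(prop)
    return False


def is_workflow(path):
    """A path [node, link, node, ...] is a workflow if any link is an isInputOf."""
    links = path[1::2]  # links sit at the odd positions
    return any(is_sub_property(link, "isInputOf") for link in links)


# Hypothetical initial path for the flight example.
flight_path = [
    "AM-FLIGHT-AS1040300041", "instanceOf", "AM-FLIGHT",
    "isAgentInputOf", "Onto-Mapper", "hasAgentOutput", "AO-FLIGHT",
]
workflow_found = is_workflow(flight_path)
```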

The presence of the RDF/OWL link isAgentInputOf indicates that the Semantic Viewer must invoke the Onto-Mapper to create an instance of AO-FLIGHT (the object of the hasAgentOutput RDF/OWL link and the output of the Onto-Mapper). To trigger the invocation of the Onto-Mapper, the Semantic Viewer asserts that “Onto-Mapper hasAgentInput AM-FLIGHT” and that “Onto-Mapper hasAgentOutput AO-FLIGHT.” The Onto-Mapper searches for representational and terminology mismatches by interpreting hasContext, isTheContextOf, and hasMatch links. The result is a set of workflows (i.e., paths containing isInputOf or its sub-properties) that consist of a sequence of web services that must be processed to reconcile mismatches between the AM and AO domains.

An example of such a workflow is shown in FIG. 22. A quick scan of this workflow reveals the pathway that will be executed by the GeoTrans web service to derive the instance value of AO-COORD from COORD: 21 N 20678076 5423265 (an instance of AM-COORD). In addition to the example workflow, the Semantic Viewer finds additional workflows for other instances of AM-COORD that need translation. Although not presented in the example, workflows are discovered for representational mismatches between the Time and Types of Things context ontologies in a similar fashion.

The Semantic Viewer processes the workflow on a node-by-node basis. In the example workflow of FIG. 22, the first node is AM-FLIGHT-AS1040300041, followed by AS1040300041 100, LOC:CYQX, AM-COORD 21 N 20678076 5423265, and UTMCOORDINATE-WGE. Since AM-COORD 21 N 20678076 5423265 inherits the hasContext link to UTMCOORDINATE-WGE from AM-COORD, the Semantic Viewer makes AM-COORD 21 N 20678076 5423265 an instance of UTMCOORDINATE-WGE. This process links AM-COORD 21 N 20678076 5423265 to GeoTrans using isInputOf, and it indicates that the GeoTrans web service must be invoked to translate between the AM and AO systems. Further, since hasOutput links LATLONHTCOORDINATE-WGE to the GeoTrans web service, the resulting output of GeoTrans is an instance of LATLONHTCOORDINATE-WGE. The processing of the isTheContextOf link makes the instance of LATLONHTCOORDINATE-WGE an instance of AO-COORD.

The translation step first requires the construction of the URL necessary to invoke the GeoTrans web service. The base URL and parameter names are read from the WSDL ontology, and the parameter values are inferred from the mapped ontology. Specifically, when execution of GeoTrans is requested, its full definition is retrieved from the ontology management system, and the base URL, http://base.mitre.org/Geotrans/, is retrieved from the GeoTrans WSDL ontology. Then for each object of inParameter (e.g. COORDREFFRAME), the Semantic Viewer runs the DPQ with input “COORD: 21 N 20678076 5423265” and with output being the object of inParameter (e.g. COORDREFFRAME).

FIG. 23 illustrates an example of a path returned from the DPQ. The path contains the parameter value to be used in the URL for that object of inParameter (e.g. COORDREFFRAME). This parameter value is identified in the returned path as the instance of the object of inParameter (e.g., COORDREFFRAME). For example, the value of COORDREFFRAME is UTM for the example pathway in FIG. 23. The parameter name is then determined by following the link isValueOf, which reveals the label of the parameter name (inputCRF). When the Semantic Viewer has processed all inParameter links, the resulting URL will take the form:

http://base.mitre.org/Geotrans/inputCRF=UTM&inputDatum=WGE&CoordString=21 N 20678076 5423265

Once all of the objects of each inParameter are processed, the Semantic Viewer turns its attention to outParameter. Similar to the processing for inParameter, the Semantic Viewer repeatedly runs the DPQ with input “COORD: 21 N 20678076 5423265” and output being each object of outParameter (e.g., COORDREFFRAME). It then finds the matching parameter label using the isValueOf link. The complete URL is of the form:

http://base.mitre.org/Geotrans/inputCRF=UTM&inputDatum=WGE&CoordString=21 N 20678076 5423265&outputCRF=Geodetic&outputDatum=WGE.
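The URL assembly described above can be sketched as follows, with the parameter labels and values that the DPQ runs would have produced supplied directly as dictionaries. Two points are assumptions of this sketch rather than the source's form: the `?` separator before the parameters and the percent-encoding of spaces, both of which follow ordinary query-string conventions. The function name `build_service_url` is hypothetical.

```python
from urllib.parse import quote, urlencode


def build_service_url(base_url, in_params, out_params):
    """Join the base URL with the input and then the output parameters."""
    params = {**in_params, **out_params}  # insertion order is preserved
    # quote_via=quote encodes spaces as %20 instead of "+".
    return base_url + "?" + urlencode(params, quote_via=quote)


# Labels (via isValueOf) and values (via DPQ) as described in the text.
in_params = {
    "inputCRF": "UTM",
    "inputDatum": "WGE",
    "CoordString": "21 N 20678076 5423265",
}
out_params = {"outputCRF": "Geodetic", "outputDatum": "WGE"}

url = build_service_url("http://base.mitre.org/Geotrans/", in_params, out_params)
```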

When the invocation of GeoTrans returns the XML document, the Semantic Viewer translates it into an RDF instance of AO-COORD, i.e., AO-COORD: 48.936668,-54.568333. Elements in the XML document are first matched with concepts in the WSDL ontology. The latter are linked to the mapped ontology using the isValueOf and isOutputValueOf links. This creates a LATLONHTCOORDINATE-WGE instance from the XML document. The Semantic Viewer then completes the workflow processing by reclassifying the LATLONHTCOORDINATE-WGE instance as an AO-COORD due to the isTheContextOf link between LATLONHTCOORDINATE-WGE and AO-COORD. The Semantic Viewer also creates an isCorrelatedWith link between AM-COORD: 21 N 20678076 5423265 and AO-COORD: 48.936668,-54.568333.

Once these workflows are executed, the algorithm presented in FIG. 24 creates a new data instance in the AO domain. This new instance is then linked to the AM instance AM-FLIGHT-AS1040300041 using the isCorrelatedWith link and imported to the ontology management system. To help the reader understand this algorithm, notional representations of the processing for the hasMatchingValue and hasMatch relationships are depicted in FIG. 25 and FIG. 26, respectively.

In the above examples, a single input data instance on the AM information system was input into the Semantic Viewer, and a single corresponding data instance in the AO information system was created. The algorithms of the present invention may be generalized to return multiple initial workflows that correspond to multiple input data instances.

FIG. 27 illustrates a case in which the outputs from the execution of multiple web services serve as necessary inputs to future chained web services. In FIG. 27, the first workflow is composed of the sequenced web service A, followed by web service C, and finally followed by the Onto-Mapper. The execution of A succeeds and its result is stored in the ontology management system. However, the execution of C fails, since the output of web service B is needed as input to web service C. The system proceeds to execute the second workflow, i.e., web service B, followed by web service C, and finally followed by the Onto-Mapper. This workflow successfully completes since the necessary output from A was stored and is currently available.
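The behavior described for FIG. 27 can be sketched with a shared result store that persists across workflows. The service bodies are stand-ins that return labeled strings, the `services` table declaring which stored outputs each service needs is a hypothetical structure, and a plain dictionary takes the place of the ontology management system.

```python
def run_workflows(workflows, services, store):
    """Execute workflows in order, keeping every successful result in `store`.

    A step fails when an output it depends on is not yet stored; the rest of
    that workflow is skipped, but results already stored remain available.
    """
    completed = []
    for workflow in workflows:
        succeeded = True
        for name in workflow:
            needs, func = services[name]
            if not all(dep in store for dep in needs):
                succeeded = False
                break
            store[name] = func(*[store[dep] for dep in needs])
        if succeeded:
            completed.append(workflow)
    return completed


# Stand-in services: (names of required stored outputs, callable).
services = {
    "A": ((), lambda: "a-out"),
    "B": ((), lambda: "b-out"),
    "C": (("A", "B"), lambda a, b: f"C({a},{b})"),
    "Onto-Mapper": (("C",), lambda c: f"AO-instance from {c}"),
}

store = {}
completed = run_workflows(
    [["A", "C", "Onto-Mapper"], ["B", "C", "Onto-Mapper"]], services, store
)
```

The first workflow stops at C because B has not yet run; the second completes because the output of A is still in the store.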

The present approach, as described by the above example, is also extensible. The previous examples concentrate on the mapping of a number of information system ontologies to a number of context ontologies of a fixed size. The invention also allows the incorporation of new context ontologies and the extension of existing context ontologies without any modification to existing mappings. In this fashion, a given context ontology may be extended to account for conceptual representations in a larger number of existing information systems. Further, new context ontologies and mappings may be created to embrace additional enterprise concepts without modifying the mapping of existing context ontologies to information systems.

EXAMPLE 3 Integrating Disparate Web Services or Web Sites Using a Semantic Web Scripting Language (SWSL)

The following example illustrates a method and a scripting paradigm for automatically integrating disparate information systems (e.g., web services and web sites) within a given enterprise using a service-oriented architecture. The approach builds upon the semantic framework outlined in Examples 1 and 2, in which a new web service enables information exchange between two or more disparate information systems, web services, or databases. The newly-developed web service may be associated with a corresponding new data model, or schema, that serves to harmonize the various data models involved in the information exchange. To facilitate the transfer of information, the basic concepts contained within data models of the disparate information systems, web services, and databases must be migrated and persisted in the new data model. The embodiments below describe how the method and the scripting paradigm automatically derive the new data model and associated ontological structures and achieve the integration of web services or web sites that integrate the disparate information systems.

As an alternative to the scripting paradigm of the semantic web scripting language (SWSL), the newly-developed web service that achieves the integration can be automatically generated in the form of C# code from the WSDL files of the legacy web services, from pairwise mappings between the WSDL files, and from mappings between the WSDL files and one or more shared contexts. In the following example, context signifies any web service and its associated WSDL that is able to bridge differences between entities in the legacy schemas. The automatically-generated C# code is associated with an automatically generated information system ontology which is mapped to information system ontologies of the web services in the information flow and to context ontologies.

FIG. 28 is a detailed flow diagram of a method 2800 for automatically deriving a new data model that facilitates the exchange of information between a target information system A and a source information system B. A number of ubiquitous concepts (e.g. time, position, etc.) characterize the source and the target information systems, and these concepts may be contained within WSDL files that are specific to each information system. The information system ontologies may be built from the WSDL or SAWSDL, and OWL/RDF relations and may be represented as a uniform structure of triples. Although the concepts characterizing each information system may be similar, significant distinctions may exist in the semantics, representation, and terminology of these concepts on the respective information systems.

Within FIG. 28, the new data model, schema Z, is created in step 2802 and subsequently populated with the concepts that characterize the target information system B. Then, in step 2804, the concepts within the target information system A and the concepts within the source information system B are mapped to a shared context. The concepts within the newly-populated schema Z are then mapped in step 2806 to the concepts within the source information system A to identify conceptual matches between the source information system A and the newly-populated schema Z, and to the concepts within the target information system B to identify conceptual matches between the target information system B and the newly-populated schema Z.

The matching process proceeds by asserting a number of OWL/RDF mapping relations, or simply mapping relations, to map concepts within the newly-populated schema Z to associated concepts within the source information system A. An appropriate set of OWL/RDF mapping relations is used to reconcile syntactic, structural, and representational mismatches between the concepts. The appropriate OWL/RDF mappings are described in Table 8 of Example 2. These mapping relations may also be captured in an Excel worksheet, in which case they are simply mapping relations rather than OWL/RDF mapping relations.

For example, the newly-populated schema Z may include a concept called time, and an associated concept within the source information system A may be called starttime. If time and starttime are identical from a semantic and a representational perspective, then the value of A::starttime may be directly assigned to Z::time. The following OWL/RDF mapping relation may then be asserted to perform the assignment:

-   -   A:: starttime has-matchingvalue Z:: time.

Alternatively, the time concept may be a semantic match with the starttime concept, except the representation of the time concept may differ significantly from the representation of the starttime concept. For example, the new schema Z may express time using the Zulu time zone format, while source information system A may express starttime in the Eastern time zone format. In such a case, the following OWL/RDF relations may specify the representational details of the respective concepts:

-   -   A:: starttime has-context Eastern-time-zone; and
-   -   Z:: time has-context Zulu-time-zone-format.

The following OWL/RDF mapping then indicates that the representation of A::starttime must be reconciled with the representation of Z::time before being assigned to Z::time:

-   -   A:: starttime has-match Z:: time.

Further, terminological mismatches may also be identified between concepts that are identical from a semantic and a representational perspective. In such a case, the following OWL/RDF mapping relation may be asserted to resolve a terminological mismatch between A::starttime and Z::time:

-   -   A:: starttime owl::sameAs Z:: time.
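The three kinds of mapping relation described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the triple layout, the `transfer` helper, and the fixed UTC offsets standing in for the Eastern and Zulu contexts (with no daylight-saving handling) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-ins for the two contexts; fixed offsets, no DST handling.
CONTEXT_TZ = {
    "Eastern-time-zone": timezone(timedelta(hours=-5)),
    "Zulu-time-zone-format": timezone.utc,
}

# Triples written as (subject, relation, object), mirroring the text above.
TRIPLES = [
    ("A::starttime", "has-context", "Eastern-time-zone"),
    ("Z::time", "has-context", "Zulu-time-zone-format"),
    ("A::starttime", "has-match", "Z::time"),
]


def context_of(concept, triples):
    """Return the context asserted for a concept, or None if there is none."""
    for subject, relation, obj in triples:
        if subject == concept and relation == "has-context":
            return obj
    return None


def relation_between(source, target, triples):
    """Return the mapping relation asserted from source to target."""
    for subject, relation, obj in triples:
        if subject == source and obj == target:
            return relation
    raise LookupError(f"no mapping from {source} to {target}")


def transfer(value, source, target, triples):
    """Produce the value to assign to target, given the asserted relation."""
    relation = relation_between(source, target, triples)
    if relation in ("has-matchingvalue", "owl::sameAs"):
        # Same meaning and same representation: assign directly.
        return value
    if relation == "has-match":
        # Same meaning, different representation: reconcile via the contexts.
        source_tz = CONTEXT_TZ[context_of(source, triples)]
        target_tz = CONTEXT_TZ[context_of(target, triples)]
        return value.replace(tzinfo=source_tz).astimezone(target_tz)
    raise ValueError(f"unsupported relation: {relation}")


start = datetime(2007, 2, 9, 8, 0)  # naive value as stored on system A
zulu = transfer(start, "A::starttime", "Z::time", TRIPLES)
```

Under these assumptions, only the has-match branch consults the has-context triples; the other two relations reduce to a plain assignment.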

Once the concepts within new schema Z are mapped to the concepts in the source information system A, the source information system A is scanned in step 2808 to identify concepts that have no corresponding match in the new schema Z but that need to be included in schema Z. These identified concepts must then be migrated into and persisted in the new schema Z in step 2810. For each identified concept within source information system A, a corresponding new concept is created within the new schema Z. The new concepts in schema Z are then mapped to the corresponding identified concepts in the source information system A by asserting the OWL/RDF relation has-matchingvalue.
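Steps 2808 and 2810 can be sketched as a single pass over the source concepts. The function name, the `needed` set (standing in for the judgment about which unmatched concepts must be included), and the `A::`/`Z::` naming scheme are assumptions for illustration only.

```python
def migrate_unmatched(source_concepts, needed, schema_z, mappings):
    """Sketch of steps 2808-2810: add needed, unmatched source concepts to Z.

    source_concepts: concept names on source system A, e.g. "A::callsign"
    needed:          the subset of source concepts that must appear in Z
    schema_z:        set of concept names already in schema Z
    mappings:        list of (subject, relation, object) triples
    """
    already_matched = {subject for subject, _relation, _obj in mappings}
    for concept in source_concepts:
        if concept in already_matched or concept not in needed:
            continue
        # Create the corresponding new concept in schema Z ...
        new_concept = "Z::" + concept.split("::", 1)[1]
        schema_z.add(new_concept)
        # ... and map it back to the source concept with has-matchingvalue.
        mappings.append((concept, "has-matchingvalue", new_concept))
    return schema_z, mappings


schema_z = {"Z::time"}
mappings = [("A::starttime", "has-matchingvalue", "Z::time")]
migrate_unmatched(
    ["A::starttime", "A::callsign", "A::debugflag"],
    needed={"A::callsign"},
    schema_z=schema_z,
    mappings=mappings,
)
```

In this example, A::callsign is migrated into Z, A::starttime is skipped because it is already matched, and A::debugflag is skipped because it is not needed.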

Steps 2806, 2808, and 2810 generate the new schema Z and its corresponding metadata, and this new schema is output in step 2810. The set of mapping relations used to define new schema Z may be captured by a script writer in text format, in a Microsoft Excel document, in HTML, or in any other appropriate format for later use. Further, the process described by FIG. 28 is not limited to a single source information system and a single target information system. The present invention may link any number of source information systems, databases, or web services to any number of target information systems, databases, or web services using the new schema, Z.

FIG. 29 is a detailed flow diagram of an exemplary method 2900 for automatically developing a new web service that integrates disparate information systems in accordance with the present invention. In step 2902, WSDL files that correspond to each of the disparate web services or web sites are loaded into memory, and an equivalent ontological representation, known as a WSDL ontology, is created for each WSDL file. From the WSDL files, the information system ontologies are also created. Thus, for source information system A and target information system B, step 2902 loads the respective source and target WSDL files and generates the respective source WSDL and information system ontologies and target WSDL and information system ontologies. Additionally, mapping relations or links are created between concepts in the source WSDL ontology and source information system ontology. Similarly, mapping relations or links are created between concepts in the target WSDL ontology and the target information system ontology. The mappings between the WSDL ontology and the information system ontology may be performed using the mapping relations of Table 7 above.
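As a rough illustration of the loading performed in step 2902, the following Python sketch reads the schema elements of a small WSDL document and emits triples for a hypothetical information system ontology. The sample WSDL, the `has-datatype` relation, and the `A::` naming scheme are assumptions; the text does not specify the conversion rules, and the WSDL ontology and its links to the information system ontology are omitted here.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"  # namespace of XML Schema elements

# Hypothetical, heavily reduced WSDL for source information system A.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types>
    <xsd:schema>
      <xsd:element name="starttime" type="xsd:dateTime"/>
      <xsd:element name="callsign" type="xsd:string"/>
    </xsd:schema>
  </types>
</definitions>"""


def wsdl_to_triples(wsdl_text, system):
    """Turn each named schema element into concept triples for one system."""
    root = ET.fromstring(wsdl_text)
    triples = []
    for element in root.iter(XSD + "element"):
        name = element.get("name")
        if name is None:
            continue  # skip element references that carry no name
        concept = f"{system}::{name}"
        triples.append((concept, "rdf:type", "owl:Class"))
        # "has-datatype" is an illustrative relation, not one from the text.
        triples.append((concept, "has-datatype", element.get("type")))
    return triples


ontology_a = wsdl_to_triples(SAMPLE_WSDL, "A")
```

The result is a flat list of triples, which matches the uniform triple structure the description attributes to the information system ontologies.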

Additionally, step 2902 loads contexts from WSDL files of translator services, and generates a corresponding set of context ontologies, such as a position context ontology associated with a position translator and a time context ontology associated with a time translator.

Alternatively, a semantically-annotated WSDL file (SAWSDL) may be loaded in step 2902 for each of the disparate information systems, and corresponding information system ontologies may be created from the SAWSDL files. The resulting information system ontology may correspond to the “type” section of an associated WSDL file, but with the annotating concepts of the SAWSDL file replacing the annotated XML elements in the WSDL file. Additionally, a mapping relation or a link is created between the annotating concept in the SAWSDL ontology and the annotated concept in a corresponding information system ontology.

Once the source and the target WSDL ontologies, context ontologies, and information system ontologies have been created, a new schema Z and a set of OWL/RDF mappings that define the new schema Z are loaded into memory in step 2904, and a corresponding information system ontology Z is built. In a preferred embodiment, the new schema Z may be defined using the techniques described in FIG. 28. Alternatively, the new schema Z may be derived from any alternative technique that provides a set of appropriate OWL/RDF mapping relations.

The information system ontology Z contains concepts from the target information system ontology. In step 2906, each concept within the target information system ontology may be mapped to the corresponding concept in the information system ontology Z by asserting an appropriate mapping relation, such as has-matchingvalue or owl:equivalentClass, or an appropriate hasMatch relation between the concepts. Alternatively, the mapping process within step 2906 may assert a different mapping relation between the concepts, or the mapping process may not assert any relation between the concepts. The process by which these relations are chosen and mapped is described below in reference to FIG. 33 and FIG. 34, and variations in these processes are possible without departing from the spirit and scope of the present invention.

The concepts that characterize the information system ontology Z are subsequently matched in step 2908 to concepts that characterize the source information system ontology. The matching process within step 2908 utilizes the set of OWL/RDF mappings that define the new schema Z and that were loaded with the new schema Z in step 2904. Thus, for each OWL/RDF mapping relation within the new schema Z (e.g., A:: starttime has-matchingvalue Z:: time), an identical relation is asserted between the corresponding pair of concepts in the source information system ontology and the information system ontology Z.
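The replication performed in step 2908 can be sketched as follows. The dictionary layout of the ontologies and the function name are assumptions for illustration; the sketch simply re-asserts each schema-level relation between concepts that exist in both ontologies.

```python
def replicate_mappings(schema_mappings, source_ontology, ontology_z):
    """Sketch of step 2908: re-assert schema Z's mappings at the ontology level.

    Each ontology is assumed to be a dict with a "concepts" set and a
    "mappings" list of (subject, relation, object) triples.
    """
    asserted = []
    for subject, relation, obj in schema_mappings:
        # Only assert a relation when both concepts are actually present.
        if subject in source_ontology["concepts"] and obj in ontology_z["concepts"]:
            triple = (subject, relation, obj)
            ontology_z["mappings"].append(triple)
            asserted.append(triple)
    return asserted


source_ontology = {"concepts": {"A::starttime", "A::callsign"}, "mappings": []}
ontology_z = {"concepts": {"Z::time", "Z::callsign"}, "mappings": []}
schema_mappings = [
    ("A::starttime", "has-matchingvalue", "Z::time"),
    ("A::callsign", "owl::sameAs", "Z::callsign"),
    ("A::altitude", "has-match", "Z::altitude"),  # skipped: not in either ontology
]
asserted = replicate_mappings(schema_mappings, source_ontology, ontology_z)
```

Because the mappings are stored on ontology Z, it ends up holding more information than the source ontology alone.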

The matching process within step 2908 also increases the amount of data within the information system ontology Z, as the information system ontology Z stores a set of mappings that link the information system ontology Z with both the source and target information system ontologies. Thus, the resulting information system ontology contains more information than was originally present within the source and target information system ontologies. Further, upon completion of step 2908, the information system ontology Z is correctly mapped to both the source information system ontology and the target information system ontology. As these source and target information system ontologies have been previously mapped to their associated context ontologies using the process described in Example 2, the information system ontology is also correctly mapped to the set of context ontologies.

In step 2910, a web service and a corresponding WSDL/SAWSDL file are generated for the new schema Z from the corresponding information system ontology Z and from the context ontologies to which the information system ontology Z is mapped. The web service also implements the workflow specified in the SWSL script. By creating an appropriate web service description file (WSDL) of the new schema, the web service may be invoked to broker information requests between the source and target information systems and to facilitate data interoperability between the source and target information systems.

Although described in terms of a single source and target information system, the process outlined in FIG. 29 may integrate multiple source and multiple target information systems using a single, additional web service. Further, the present invention may incorporate any of a number of information systems, including a variety of web services (e.g., web sites) and a variety of structured databases abstracted by web services.

In general, data modelers design the new schema Z and information system ontology, and programmers generate the source code that implements the new web service that exposes the data. The source code is often compiled and executed using specific software and hardware components after a community-of-interest has agreed on the shared vocabulary described in the data model and the information system ontologies. Further, the technical knowledge required to develop and execute the source code may not be universally held throughout the community-of-interest, resulting in a significant disconnect between those with the greatest knowledge of the individual information systems and those that integrate the information systems. Thus, the integration process outlined within FIG. 28 and FIG. 29, if done manually, may not operate efficiently, as the community-of-interest must first develop a shared vocabulary before the programmer can begin the implementation of the web services that integrate the information systems.

An additional embodiment of the present invention is a new scripting paradigm, a Semantic Web Scripting Language (SWSL), that allows a script writer to achieve the integration of web services and web sites and automatically build the shared vocabulary consisting of information system ontologies, mappings, and context ontologies. SWSL is built upon existing, open-source web-scripting languages, such as JScript, VBScript, and Ruby, and SWSL scripts combine these scripting languages with HTML and metadata and may be executed using a web browser (or alternatively, an appropriate server). By abstracting mediation and workflow composition, and requiring only knowledge of basic scripting languages and a web browser, the use of SWSL as a scripting paradigm encourages mass participation in the integration process.

FIG. 30 is an exemplary SWSL script that specifies the steps necessary to integrate two disparate information systems, the Mobility Air Force Command (MAF) and the Combat Air Force Command (CAF), within the Department of Defense enterprise. Each information system is abstracted by a web service, which is characterized by a data model, and each respective data model incorporates the ubiquitous concepts that characterize each respective information system. The data model and concepts that characterize each respective web service may be incorporated into a corresponding WSDL file.

The exemplary SWSL script of FIG. 30 is comprised of five major blocks, along with the necessary initialization and steps. The exemplary SWSL script first imports context ontologies or context WSDLs that bridge mismatches between entities or attributes in the legacy systems involved in the information flow. The exemplary SWSL script could additionally import any number of context ontologies or context WSDLs that would be relevant to the disparate information systems. Furthermore, a SWSL script may import other SWSL scripts.

The exemplary SWSL script then imports WSDL files for each of the disparate web services or web sites that are subject to integration. The WSDL importation block loads WSDL files that correspond to the MAF system (maf.wsdl) and the CAF system (caf.wsdl). These WSDL files incorporate the concepts and the data models that characterize each respective information system, and these WSDL files are used to build corresponding WSDL ontologies and information system ontologies, including a MAF WSDL ontology, a MAF information system ontology, a CAF information system ontology, and a CAF WSDL ontology.

Once the necessary WSDL files and the necessary context ontologies have been loaded into memory, the exemplary script then aligns the concepts that characterize the MAF and the CAF systems. The alignment block derives a new schema by mapping concepts in the MAF information system ontology to concepts in the CAF information system ontology using a set of OWL/RDF relations, as was described above in reference to FIG. 28. The alignment block may also result in changes to the information system ontologies and to the context ontologies.

Within the alignment block, has-matchingValue is asserted with arguments maf-callsign and caf-callsign, thus indicating that the maf-callsign concept and the caf-callsign concept are identical from a semantic and representational perspective. Accordingly, the value of maf-callsign may be immediately assigned to caf-callsign within the new schema. Additionally, the assertion of has-match with arguments maf-time and caf-time indicates that the concepts are semantically equivalent, but different in representation. As such, the representational differences between these concepts must be resolved before the value of maf-time may be assigned to the value of caf-time. For example, maf-time has-context Zulu time zone, and caf-time has-context ET-time-zone are asserted to reconcile the representational mismatch. Further, by asserting the owl::sameAs relation, the exemplary SWSL script indicates that the Landing and ARR concepts exhibit a terminological mismatch while being identical from a representational and a semantic perspective.

The mapping operations within the alignment block in the SWSL script effect the mappings between the information system ontologies of the source and target system and help derive the new schema by defining the semantic, representational, and terminological matches between the concepts of the new schema and the concepts of the MAF and CAF information system ontologies. The newly-generated schema also forms the basis for a new information system ontology and a corresponding web service that integrate the MAF and CAF information systems. The processing of the alignment block may also result in changes to the MAF and CAF information system ontologies.

The exemplary SWSL script then orchestrates workflows from the MAF web service to the CAF web service through the newly created web service. In SWSL the workflows are specified using the “Step” functionality of SWSL. As an alternative to expressing the workflow in scripting, the workflow can be expressed in HTML. The set of OWL/RDF mappings that define the new schema Z are used to establish similar mappings between the information system ontology Z, the MAF information system ontology, and the CAF information system ontology, as was described above in reference to FIG. 29. These mappings then serve to reconcile syntactic, structural, and representational mismatches between the MAF and the CAF.

Once workflows between concepts in the respective WSDL ontologies have been orchestrated, any remaining terminological, semantic, and representational mismatches are resolved within the mediation block. The resulting information system ontology incorporates the concepts from the information system and context ontologies, as well as the set of mappings between concepts in the MAF and CAF information system ontologies. As such, the resulting information system ontology may contain more data than is contained within the WSDL ontologies. Further, a web service and corresponding WSDL file or SAWSDL file may be created for the new information system ontology.

FIG. 31 is a second exemplary SWSL script that specifies the steps necessary to integrate two disparate information systems, the Mobility Air Force Command (MAF) and the Combat Air Force Command (CAF), within the Department of Defense enterprise. As was discussed in reference to FIG. 30, each information system is abstracted by a web service, which is characterized by a data model, and each respective data model incorporates the ubiquitous concepts that characterize each respective information system.

The essential components of the exemplary SWSL script of FIG. 31 are functionally identical to those outlined with regard to the exemplary script of FIG. 30. However, the exemplary script of FIG. 31 employs HTML to align the concepts that characterize the MAF and the CAF information systems and to derive a new schema by mapping concepts within the MAF and CAF information system ontologies using a set of mapping relations.

The exemplary SWSL script first imports the data models and the concepts that characterize the web services of the respective MAF and CAF information systems. In the embodiment of FIG. 31, the data models and concepts are obtained by loading either WSDL files of the MAF and CAF services or information system ontologies for each of the respective MAF and CAF information systems. The exemplary SWSL script then loads context ontologies, or context WSDLs, that describe the various representations of position and time across the MAF and CAF enterprises. The information system ontologies and the context ontologies may be built from OWL/RDF relations and may be represented as a uniform structure of triples in a fashion similar to that outlined in Examples 1 and 2. Further, although described only in terms of the time and position context ontologies, the exemplary SWSL script of FIG. 31 may import any number of context ontologies, schemas, or WSDL files that would be relevant to the information systems subject to integration.

Alternatively, the exemplary SWSL script may obtain the data models and concepts by loading a WSDL file for each of the disparate web services and web sites that are subject to integration. These WSDL files incorporate the concepts and the data models that characterize the respective MAF and CAF information systems, and these WSDL files could then be used to build corresponding WSDL ontologies and information system ontologies, including a MAF WSDL ontology, a MAF information system ontology, a CAF information system ontology, and a CAF WSDL ontology. Also, as an alternative to loading context ontologies, one can load WSDL files of translator services that need to be invoked by the mediator ontomapper.

In additional embodiments, the exemplary SWSL script may obtain the necessary data models and concepts by loading a .SWSL script file for each of the disparate web services and web sites that are subject to integration. Further, although not described within the embodiment of FIG. 31, the exemplary SWSL script may import any number of previously-defined SWSL scripts.

The exemplary SWSL script of FIG. 31 employs HTML to import the data models and concepts that characterize the MAF and the CAF information systems. As such, the exemplary SWSL script may be run using a conventional web browser. Alternatively, the exemplary SWSL script could employ existing, open-source web-scripting languages, such as JScript, VBScript, and Ruby, to load the necessary schema, ontological structures, and WSDL files. In such a case, the resulting SWSL script could be executed using an appropriate server. In additional embodiments, the exemplary SWSL script could employ any combination of HTML code and web scripting languages to import the data models and concepts that characterize the respective MAF and CAF information systems.

Once the necessary data models and concepts have been loaded into memory, the concepts that characterize the individual MAF and CAF information systems are aligned by the exemplary SWSL script within an alignment block. The alignment block derives a new schema by mapping concepts in the MAF information system ontology to concepts in the CAF information system ontology, and by mapping these concepts to one or more shared contexts using a set of mapping relations, as was described above in reference to FIG. 28.

The exemplary SWSL script of FIG. 31 employs HTML tags to define mappings between concepts in the respective MAF and CAF information systems using a set of mapping relations, and the exemplary alignment block of FIG. 31 employs the owl:equivalentClass relation to define semantic and representational matches between concepts in the MAF and CAF information system ontologies (the relation is equivalent to the has-matchingvalue relation described above in reference to Example 2).

For example, the inclusion of the maf-callsign concept and the caf-callsign concept within the tag for the owl-equivalentClass relation indicates that these concepts are identical from a semantic and representational perspective, and accordingly, the value of maf-callsign may be immediately assigned to caf-callsign within the new schema. A similar semantic and representational relationship exists between the maf-eventStartTime concept and the caf-eventTime concept and between the maf-taskunit concept and the caf-taskunit concept, and the value of each respective MAF concept may be immediately assigned to the corresponding CAF concept within the new schema.

The operations within the exemplary alignment block of FIG. 31 effect the mappings between the MAF and CAF information system ontologies and help derive the new schema by defining the semantic, representational, and terminological matches between the concepts of the new schema and the concepts of the MAF and CAF information system ontologies. The newly-generated schema also forms the basis for a new information system ontology and a corresponding web service that integrate the MAF and CAF information systems. The operations within the exemplary alignment block of FIG. 31 may also affect the MAF and CAF information system ontologies.

The exemplary SWSL script then orchestrates workflows from the MAF web service to the CAF web service through the newly created web service. In the exemplary SWSL script, the workflows are specified using the “Step” functionality of SWSL. As an alternative, the workflow could have been specified in HTML rather than in scripting. The set of mapping relations that define the generation of the new schema are used to establish similar mappings between the new information system ontology, the MAF information system ontology, and the CAF information system ontology, as was described above in reference to FIG. 29. These mappings then serve to reconcile syntactic, structural, and representational mismatches between the MAF and the CAF.

Once workflows between concepts in the respective MAF and CAF information system ontologies (or alternatively, between concepts in the respective WSDL ontologies) have been orchestrated, any remaining terminological, semantic, and representational mismatches are resolved within the mediation block. As an alternative, the mediation block could have been specified in HTML rather than in scripting. The resulting information system ontology incorporates the concepts from the information system and context ontologies, as well as the set of mappings between concepts in the MAF and CAF information system ontologies. As such, the resulting information system ontology may contain more data than is contained within the MAF and CAF information system ontologies (or alternatively, the respective WSDL ontologies). Further, a web service and corresponding WSDL file or SAWSDL file may be created for the new information system ontology.

The exemplary alignment block of the SWSL script of FIG. 31 has been described in terms of a single mapping operation, i.e., the mapping of concepts in the MAF and CAF information systems using the owl:equivalentClass relation. In additional embodiments, the exemplary alignment block of FIG. 31 may map concepts from the MAF and CAF information system ontologies using any number of mapping relations and their corresponding definitions in HTML. FIGS. 32(a) through 32(e) illustrate portions of an exemplary alignment block that employ HTML to map together concepts within the MAF and CAF information systems using a number of appropriate mapping relations.

FIG. 32(a) illustrates an exemplary HTML that employs owl-sameAs to map concepts within the MAF information system ontology to concepts within the CAF information system ontology. The owl-sameAs relation indicates that two concepts are identical from a semantic perspective, but exhibit a terminological mismatch. For example, the use of the owl-sameAs relation in FIG. 32(a) to map the MAF-F015 concept to the CAF-F15C concept indicates that, while these concepts are semantically equivalent, a terminological mismatch exists between the MAF-F015 concept and the CAF-F15C concept.

An exemplary HTML that maps concepts within the MAF and CAF information system ontologies using mappings-match is illustrated in FIG. 32(b). The mappings-match relation is equivalent to the hasMatch relation described above in reference to Example 2, and the mappings-match relation indicates that two concepts are semantically identical, but have different representations on the disparate information system ontologies. In FIG. 32(b), the mappings-match relation indicates that the maf-coord concept and the caf-coord concept are semantically identical, but have a different representation on the respective MAF and CAF information systems. For example, the maf-coord concept on the MAF information system may be represented by the Eastern time zone, while the caf-coord concept on the CAF information system may be represented by the military Zulu time zone.

In FIG. 32(c), an exemplary portion of HTML employs mappings-hasContext to specify a contextual relationship between a concept within the information system ontology and a concept within the shared context ontology. In the example of FIG. 32(c), the mappings-hasContext relation specifies that a contextual relationship exists between the caf-coord concept and the Coord-GEODETIC-WGE concept within the position context ontology.

FIG. 32(d) illustrates an exemplary portion of HTML that employs mappings-hasRelation to specify a generic relation between a pair of concepts within the same ontology or existing in different ontologies. In the example of FIG. 32(d), the use of the mappings-hasRelation relation specifies that a generic relationship exists between the coord-UTM-WGE concept and the WGE concept within the position context ontology.

The exemplary portion of HTML within FIG. 32(e) employs rdfs-subclassOf to indicate that one concept in an ontology is an instance of (or a subclass of) a second concept within the same or a different ontology. In the example of FIG. 32(e), the assertion of the rdfs-subclassOf relation specifies that the MA-B052H concept is an instance of the MAF-AircraftType concept within the MAF information system ontology. In a similar fashion, the CAF-F015 concept within the CAF information system ontology is a subclass of the broader caf-AircraftType concept within the CAF information system ontology.

The exemplary portions of the SWSL alignment block presented within FIGS. 32(a)-32(e) employ HTML to align concepts within the MAF and CAF information system ontologies, and as such, these exemplary portions can run in a conventional web browser. In alternate embodiments, the various portions of the SWSL alignment block may be implemented using any of a number of open-source web-scripting languages, such as JScript, VBScript, and Ruby, or through the use of any combination of web-scripting languages and HTML coding that would be apparent to one skilled in the art.
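To make the idea of an HTML alignment block concrete, the following Python sketch extracts mapping triples from a small HTML fragment. The markup itself is hypothetical: the figures are not reproduced here, so the use of `span` elements with a `class` naming the relation and `data-source`/`data-target` attributes naming the concepts is purely an assumption. Only the relation names and concept names are taken from the description.

```python
from html.parser import HTMLParser

# Relation names used in the description of FIGS. 31 and 32(a)-32(e).
KNOWN_RELATIONS = {
    "owl-equivalentClass",
    "owl-sameAs",
    "mappings-match",
    "mappings-hasContext",
    "mappings-hasRelation",
    "rdfs-subclassOf",
}

# Hypothetical markup; the real SWSL alignment syntax may differ entirely.
ALIGNMENT_HTML = """
<span class="owl-sameAs" data-source="MAF-F015" data-target="CAF-F15C"></span>
<span class="mappings-match" data-source="maf-coord" data-target="caf-coord"></span>
<span class="mappings-hasContext" data-source="caf-coord"
      data-target="Coord-GEODETIC-WGE"></span>
<span class="note">ignored: not a mapping relation</span>
"""


class AlignmentParser(HTMLParser):
    """Collect (source, relation, target) triples from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        relation = attributes.get("class")
        source = attributes.get("data-source")
        target = attributes.get("data-target")
        if relation in KNOWN_RELATIONS and source and target:
            self.triples.append((source, relation, target))


parser = AlignmentParser()
parser.feed(ALIGNMENT_HTML)
alignment_triples = parser.triples
```

Because the mappings live in ordinary HTML attributes, the same fragment could also be rendered by a browser, which is the property the description emphasizes.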

SWSL scripts, such as the exemplary scripts of FIGS. 30 and 31, facilitate the development of information system ontologies that link disparate data models using mappings between the disparate data models. Thus, SWSL scripts may serve the additional purpose of indexing and retrieving previously generated data models, as each SWSL script contains not only a new data model, but also existing data models that the newly-generated data model serves to integrate.

Further, since SWSL scripts also reference context ontologies, or contexts, and the mappings of data models to them, each of the referenced context ontologies can serve as a specialized index. For example, the “time” and “position” context ontologies from the exemplary scripts of FIGS. 30 and 31 can serve as time and position indices. The indexing properties of SWSL scripts may capture important metrics such as the number of mappings that connect a SWSL script to one or more context ontologies, or contexts, the number of mappings that connect two or more data models, and the designation of the newly generated data models from the SWSL script. Data models may also be discovered by querying the context ontology indices, and these queries may identify data models that have links to specific context ontologies and data models that correspond to SWSL scripts with links to other designated data models.
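The indexing idea can be sketched in Python as an inverted index from context ontologies to data models. The record layout for a harvested script (`name`, `data_model`, `context_mappings`) and the sample script names are assumptions for illustration, not a format defined in the description.

```python
from collections import defaultdict


def build_context_index(scripts):
    """Index data models by the context ontologies their scripts reference.

    Returns (index, counts): index maps a context ontology to the set of
    data models linked to it; counts maps (script name, context ontology)
    to the number of mappings connecting them.
    """
    index = defaultdict(set)
    counts = defaultdict(int)
    for script in scripts:
        for _concept, context in script["context_mappings"]:
            index[context].add(script["data_model"])
            counts[(script["name"], context)] += 1
    return index, counts


# Hypothetical harvested scripts; names and mappings are illustrative only.
scripts = [
    {
        "name": "maf_caf.swsl",
        "data_model": "Z",
        "context_mappings": [
            ("maf-time", "time"),
            ("caf-time", "time"),
            ("caf-coord", "position"),
        ],
    },
    {
        "name": "other.swsl",
        "data_model": "Y",
        "context_mappings": [("y-location", "position")],
    },
]

index, counts = build_context_index(scripts)
models_with_position = sorted(index["position"])
```

A query such as "which data models are linked to the position context" then becomes a single lookup, and the counts give one of the metrics mentioned above.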

Unlike established programming languages such as C or Pascal, the SWSL scripting paradigm does not explicitly embrace “strong typing.” However, the SWSL may support a number of object types, including:

-   -   Input Parameter Type; Output Parameter Type; Web Service Type;        Input Object Type; Output Object Type; Onto Mapper Type; Method        Name Type; Composite Type; Complex Type; Instance Value Type;        and Tag Type.

By embracing “weak typing,” SWSL scripts hide the above object types from the script writer, and the SWSL interpreter engine converts the data types in the SWSL script into the appropriate object types described above. However, the selection of the scripting language upon which the scripting paradigm is built dictates the handling of SWSL data types, and the present invention is not limited to the object types and to the “weak typing” standard described above.

The SWSL scripting paradigm also provides a script writer the ability to write scripts that incorporate both concepts and metadata. SWSL scripts may define an instance of a concept within the script, and the resulting script may treat an instance of a concept as a specified, assignable value. Further, SWSL scripts may support the specialization of a concept with a restriction (e.g., restrict the concept “Geodetic” with a specified datum). SWSL scripts, when incorporating metadata, may provide the script writer the ability to tag an XML element or attribute with a concept and the facility to load an XML element with its tag value from the SAWSDL file.

In addition, SWSL scripts may provide a script writer the option of importing previously-developed context ontologies (e.g., time, position, units of measure, etc.), and these imported ontologies may be represented as context modules with one or more associated web services. SWSL scripts may additionally associate an imported context with a particular concept or a variable. Furthermore, a SWSL script may import other existing SWSL scripts.

FIG. 33 is a detailed illustration of an exemplary method 3300 for automatically generating ontological structures necessary to integrate disparate information systems within the exemplary SWSL script of FIG. 31. A shared information system ontology Z may be automatically generated within FIG. 33 to integrate two disparate information systems, the Mobility Air Force Command (MAF) and the Combat Air Force Command (CAF), within the Department of Defense enterprise. The exemplary method of FIG. 33 leverages and expands upon the techniques for generating the information system ontology Z described above in reference to FIG. 29.

Within FIG. 33, step 3302 provides ontologies for the CAF and MAF information systems in OWL or RDF format. In one embodiment, at least one of the MAF and CAF ontologies may be generated from a corresponding WSDL file using a WSDL-to-OWL conversion tool or a custom-built conversion tool. In an additional embodiment, at least one of the MAF and CAF ontologies may be imported directly in OWL or RDF format using a SWSL script, such as the exemplary script of FIG. 31.

Step 3304 provides an HTML file, or the alignment block of the SWSL script, containing mapping relations that link concepts or classes within the CAF ontology, the MAF ontology, and any context ontology, such as the position and time context ontologies. The mapping relations within the HTML mappings file may be generated by a subject matter expert (SME) in the domain of integration, and in a preferred embodiment, the HTML mappings file may be imported by a script, such as the exemplary SWSL script described in FIG. 31. Further, the mapping relations within the SWSL file may incorporate a set of HTML alignment statements that are invoked within the exemplary SWSL script of FIG. 31.

Information system ontology Z is then initialized with the CAF ontology in step 3306. The initialization process may incorporate into the information system ontology Z those concepts and data models that characterize the CAF ontology.

In step 3308, a set of relations may be created between concepts or classes within the CAF ontology and corresponding concepts or classes within the information system ontology Z. Within step 3308, a number of owl:equivalentClass relations and has-match relations may be added to the information system ontology Z. The selection of a particular relation depends upon the structure of concepts within the CAF ontology (e.g., whether the concept is defined using triples and is not a leaf node) and the mappings within the HTML mappings file (e.g., whether any mappings involve CAF concepts).

Once the concepts within the CAF ontology have been mapped to corresponding concepts within the information system ontology Z, mapping relations within the HTML mappings file, or the alignment block of the SWSL script, are added to the information system ontology Z in step 3310. Within step 3310, the exemplary method may incorporate any number of has-relation relations into the information system ontology Z. A particular has-relation relation may be identified from the HTML mappings file that links a concept A within an ontology X, X#A, to a second concept B within the same ontology, X#B. The ontology X may be the MAF ontology, the CAF ontology, or any of a number of context ontologies, such as the position or time context ontologies, that are relevant to the enterprise. A representation of concept A in the information system ontology Z, Z#A, may then be generated if it does not already exist, and a representation of concept B within the information system ontology Z, Z#B, may be generated if concept B exists within the MAF ontology. Once the various representations of concepts A and B have been generated in the information system ontology Z, the has-relation relation is then added to Z#A with respect to X#B in step 3310. In a similar fashion, the has-relation relation may be added to Z#B with respect to X#A. This process may be repeated for each has-relation relation within the HTML mappings file. In addition to the method described above, the has-relation relation can simply create a relation between X#A and X#B and persist it in X.
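The has-relation handling of step 3310 may be sketched as follows. This is a minimal illustration only, assuming that concepts and relations are held in in-memory Python sets; the function name `add_has_relation`, the string identifiers, and the choice to add the reverse relation only when Z#B is generated are assumptions made for illustration and are not part of SWSL.

```python
def add_has_relation(z_concepts, z_triples, ontology, a, b, maf_concepts):
    """Add one has-relation mapping (ontology#a to ontology#b) to ontology Z.

    z_concepts   -- set of concept identifiers already present in Z
    z_triples    -- set of (subject, relation, object) tuples held in Z
    maf_concepts -- names of concepts known to exist in the MAF ontology
    """
    z_a, z_b = f"Z#{a}", f"Z#{b}"
    x_a, x_b = f"{ontology}#{a}", f"{ontology}#{b}"

    # Generate Z#A if it does not already exist (set.add is idempotent).
    z_concepts.add(z_a)
    # Add the has-relation to Z#A with respect to X#B.
    z_triples.add((z_a, "has-relation", x_b))

    # Generate Z#B only if concept B exists within the MAF ontology,
    # then add the has-relation to Z#B with respect to X#A.
    if b in maf_concepts:
        z_concepts.add(z_b)
        z_triples.add((z_b, "has-relation", x_a))


# Hypothetical concept names, used only to exercise the sketch.
concepts, triples = set(), set()
add_has_relation(concepts, triples, "MAF", "Aircraft", "Position",
                 maf_concepts={"Aircraft", "Position"})
```

Because both containers are sets, repeating the call for the same mapping leaves the information system ontology Z unchanged, which matches the "if it does not already exist" condition in the text.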

The owl:equivalentClass relations within the HTML mappings file may also be incorporated into the information system ontology Z within step 3310. In particular, an owl:equivalentClass relation may be identified that links a concept in the CAF ontology, CAF#A, with a concept in the MAF ontology, MAF#B. The owl:equivalentClass relation may then be added to Z#A with respect to both X#B and X#A, where Z#A has been previously generated using the techniques described above. This process may be repeated for each owl:equivalentClass relation within the HTML mappings file. In addition, the owl:equivalentClass relation may require processing beyond that described above.

Further, a number of sub-class-of relations may be added to the information system ontology Z within step 3310. For each sub-class-of relation between X#A and X#B, a corresponding representation Z#A is created within the information system ontology Z. Alternatively, the representation Z#A may have been previously generated using the techniques described above. The rdfs:subClassOf relation may then be added to Z#A with respect to X#B. In addition to the method described above, the rdfs:subClassOf relation can simply create a subclass relation between X#A and X#B and persist it in X.

A number of same-as relations may also be incorporated into the information system ontology Z within step 3310. For each same-as relation between a concept in the MAF ontology, MAF#A, and a concept in the CAF ontology, CAF#B, an owl:sameAs relation may be added to the information system ontology Z linking Z#A to Z#B. In a similar fashion, the owl:sameAs relation may be added to link Z#B to Z#A. In addition to the method described above, the same-as relation may create a relation between X#A and X#B and thus may require further processing.

Using the techniques of step 3310, each mapping relation within the HTML mappings file, or the alignment block of the SWSL script, may be individually added to the information system ontology Z to map the information system ontology Z to both the MAF ontology and the CAF ontology. As the MAF and CAF ontologies have been previously mapped to their associated context ontologies using the process described in Example 2, the information system ontology Z is also correctly mapped to the set of context ontologies.

The mapped information system ontology Z that links the MAF ontology, the CAF ontology, and the context ontologies is output in step 3312, and a web service and a corresponding WSDL/SAWSDL file may then be generated from the information system ontology Z and from the context ontologies to which the information system ontology Z is mapped. By creating an appropriate web service description (WSDL) file, the web service may be invoked to broker information requests between the source and target information systems and to facilitate data interoperability between the MAF and CAF information systems.

FIG. 34 illustrates an exemplary method 3400 for generating a set of relations between concepts in a CAF ontology and an information system ontology Z that may be incorporated into step 3308 of the exemplary method of FIG. 33. Information system ontology Z is initialized with the concepts and classes of the CAF ontology within step 3402, and a particular concept or class within the CAF ontology, class A, is identified within step 3404. The identification of the particular concept within the CAF ontology is for exemplary purposes only, and the processes outlined in FIG. 34 may be applied to all concepts within the CAF ontology without departing from the spirit and scope of the present invention.

The identified concept, class A, within the CAF ontology may be represented within the ontology by a uniform structure of triples, and in step 3406, the exemplary method determines whether class A is defined using additional triples. If class A has no additional triples in its class definition, then class A is a leaf node, and the owl:equivalentClass relation is added to the information system ontology Z in step 3408 to link class A in the information system ontology Z with class A in the CAF ontology. Within FIG. 34, the representation of class A in the information system ontology Z is given by the notation Z#A, and the representation of class A in the CAF ontology is denoted by CAF#A.

If the definition of class A contains additional triples, then class A is not a leaf node, and the exemplary method then determines in step 3410 whether any has-match relations within the HTML mappings file involve class A. If the HTML mappings file includes a has-match relation involving class A, then the has-match relation is added to Z#A with respect to CAF#A in step 3412, and the owl:equivalentClass relation is not asserted.

If, however, the exemplary method determines in step 3410 that no has-match relation in the HTML mappings file involves class A, then neither the owl:equivalentClass relation nor the has-match relation is added to Z#A with respect to CAF#A in step 3414. In this case, no relation is created between the various representations of class A in the CAF ontology and the information system ontology Z.
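The decision logic of FIG. 34 (steps 3406 through 3414) may be summarized by the following sketch. It is illustrative only: the function name `relate_class` and the representation of a class definition as a list of additional triples are assumptions, not part of the described scripts.

```python
def relate_class(class_name, additional_triples, has_match_classes):
    """Return the triple to add to ontology Z for one CAF class, or None.

    additional_triples -- triples in the class definition beyond the class
                          declaration itself (empty for a leaf node)
    has_match_classes  -- names of classes that appear in a has-match
                          relation in the mappings file
    """
    z_node, caf_node = f"Z#{class_name}", f"CAF#{class_name}"

    # Step 3406: a class with no additional triples is a leaf node.
    if not additional_triples:
        # Step 3408: link the two representations as equivalent classes.
        return (z_node, "owl:equivalentClass", caf_node)

    # Step 3410: a non-leaf class is related only through has-match.
    if class_name in has_match_classes:
        # Step 3412: owl:equivalentClass is not asserted in this branch.
        return (z_node, "has-match", caf_node)

    # Step 3414: no relation is created.
    return None
```

Applying `relate_class` to every class in the CAF ontology and collecting the non-`None` results would yield the set of relations produced by step 3308.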

FIG. 35 illustrates a structure of the exemplary scripts described above with reference to FIGS. 30 and 31. In one embodiment, the structure may be implemented in JavaScript, which allows the exemplary structure to run in the Web user's browser, or alternatively, the structure may be implemented in HTML or any other appropriate scripting language.

In FIG. 35, a data store is implemented as an RDF repository. In one embodiment, the RDF repository provides facilities for import, export, and management of RDF/OWL files. Closely associated with the data store is an RDF/OWL parser that is used to import RDF/OWL files to the data store. A path query component implements graph traversal techniques that are used in various places in workflow composition. As described above, there are two types of querying: Direct Path Query (DPQ), and Incoming Intersection Query (IIQ). Both the DPQ and IIQ algorithms make use of RDF and OWL-Lite inferences, as described below.

In one embodiment, the DPQ is given a list of input values and a desired output, and creates the set of all the direct paths that lead to the desired output concept from the input concepts. In an additional embodiment, the IIQ is also given a list of input values and a desired output, and the IIQ performs a series of DPQs in the following manner. First, the algorithm uses a DPQ to create the set of all the direct paths, starting from any node in the graph, that lead to the desired output concept. Second, for each input value, the algorithm uses a DPQ to create the set of direct paths, starting from any node in the graph, that lead to that input value. Third, the algorithm returns the intersection of these sets.
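The two query types may be illustrated with a small sketch over a directed graph stored as an adjacency dictionary. Several points are assumptions made for illustration: the function names, the treatment of a "direct path" as a simple directed path, and the choice to intersect the starting nodes of the path sets (the text does not state whether the intersection is taken over paths or over nodes). The sketch performs no RDF or OWL-Lite inference.

```python
def direct_paths(graph, sources, target):
    """DPQ sketch: all simple directed paths from any source to target."""
    paths = set()

    def walk(node, path):
        if node == target:
            paths.add(path)
            return
        for nxt in graph.get(node, ()):
            if nxt not in path:          # avoid revisiting nodes (cycles)
                walk(nxt, path + (nxt,))

    for source in sources:
        walk(source, (source,))
    return paths


def incoming_intersection(graph, inputs, output):
    """IIQ sketch: nodes from which the output and every input are reachable."""
    nodes = set(graph) | {n for outs in graph.values() for n in outs}
    # First: starting nodes of all paths that lead to the desired output.
    common = {p[0] for p in direct_paths(graph, nodes, output)}
    # Second: for each input, starting nodes of all paths that lead to it.
    for value in inputs:
        common &= {p[0] for p in direct_paths(graph, nodes, value)}
    # Third: the intersection of these sets.
    return common


# Hypothetical concept graph, used only to exercise the sketch.
graph = {"Mission": ["Aircraft", "Position"], "Aircraft": ["TailNumber"]}
paths = direct_paths(graph, ["Mission"], "TailNumber")
shared = incoming_intersection(graph, ["TailNumber"], "Position")
```

In this hypothetical graph, "Mission" is the only node from which both the input "TailNumber" and the output "Position" can be reached, so it is the only member of `shared`.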

The WMSL parser takes as input an HTML page that contains the mapping relations and produces aligned ontologies. In one embodiment, the WMSL parser accomplishes the generation of the ontologies as follows:

1. For each WSDL file declared in the WMSL-profile, the WMSL parser creates an ontology describing the type information denoted in the type section of the WSDL file. The WMSL parser also creates the Web Service Upper Ontology and automatically links it to the previously mentioned ontology.

2. For each mapping relation in the WMSL document, the WMSL parser creates a corresponding relationship between the generated ontologies. If an entity is declared in the WMSL document, but is not present in the ontology (composed of the mapped ontologies), the entity is created and inserted into the ontology along with the specified relation(s).

To accomplish step (1) above, the WMSL parser builds the ontology by leveraging the semantics already existing in the WSDL file. First, mapping patterns between XML schema primitives and the OWL/RDF vocabulary are created. In an embodiment, class membership is derived from the XML schema sequence (xs:sequence), and restrictions on properties are derived from the minOccurs/maxOccurs attributes of the xs:sequence tag. Since XML schema does not contain property names, a generic relationship may be used in the RDF triple.
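One possible reading of this mapping pattern is sketched below using Python's standard `xml.etree.ElementTree` module. The generic relationship name `hasMember`, the function name, and the decision to read minOccurs/maxOccurs from the child elements of the sequence are assumptions for illustration; the actual WMSL parser may differ.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def sequence_to_triples(xsd_text):
    """Derive membership triples and cardinality restrictions from xs:sequence."""
    root = ET.fromstring(xsd_text)
    triples = []        # (class, generic relationship, member)
    restrictions = {}   # (class, member) -> (minOccurs, maxOccurs)
    for ctype in root.iter(XS + "complexType"):
        owner = ctype.get("name")
        for elem in ctype.findall(f"{XS}sequence/{XS}element"):
            member = elem.get("name")
            # XML schema supplies no property name, so a generic one is used.
            triples.append((owner, "hasMember", member))
            # XML Schema defaults both occurrence bounds to 1.
            restrictions[(owner, member)] = (
                elem.get("minOccurs", "1"),
                elem.get("maxOccurs", "1"),
            )
    return triples, restrictions


# Hypothetical schema fragment, used only to exercise the sketch.
xsd = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Aircraft">
    <xs:sequence>
      <xs:element name="tailNumber" type="xs:string"/>
      <xs:element name="sortie" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""
triples, restrictions = sequence_to_triples(xsd)
```

The resulting tuples could then be serialized as RDF triples and OWL cardinality restrictions by whatever RDF library the implementation uses.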

FIG. 35 also depicts a web service proxy that invokes a web service on the Web. In one embodiment, the web service proxy works in conjunction with a proxy server installed at an application server to invoke web services on the Web. This pattern is necessary to overcome the security issues that can arise when invoking Web services from the Web browser. In the embodiment depicted in FIG. 35, a proxy server (not shown) is the only software infrastructure needed. The Web service proxy builds the appropriate URL that invokes a Web service, for example, by assigning the correct values to the Web service parameters for a REST-style service. After invoking the Web service, the Web service proxy translates the instance returned by the Web service into an RDF instance of the RDF schema generated from the service's WSDL. The Web service proxy subsequently stores the RDF instance into the data store.
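The URL-building step for a REST-style service may be sketched with Python's standard `urllib.parse.urlencode`. The function name, the endpoint, and the parameter names below are hypothetical; the sketch only assembles the URL and does not invoke any service or proxy server.

```python
from urllib.parse import urlencode


def build_service_url(endpoint, params):
    """Assign parameter values to a REST-style service as a query string."""
    # urlencode percent-encodes values and joins name=value pairs with "&".
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and parameters, used only to exercise the sketch.
url = build_service_url(
    "http://example.org/convert",
    {"lat": "42.5", "datum": "WGS 84"},
)
```

Encoding the values in this way keeps characters such as spaces from corrupting the request when a parameter value is taken from an instance returned by another service.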

In FIG. 35, a mediator component translates the returned document of one web service into an instance that can be consumed by another Web service in the workflow. In one embodiment, the mediator accomplishes the translation by interpreting the mapping relations in the aligned ontology. Each of the components described above offers an interface which permits its invocation, and collectively, these interfaces enable the composition or the mashing up of Web services.

Exemplary Computer Architectures

FIG. 36 is an exemplary computer architecture upon which the methods, systems, and computer program products of the present invention may be implemented, according to an embodiment of the invention. The exemplary computer system 3600 includes one or more processors, such as processor 3602. The processor 3602 is connected to a communication infrastructure 3606, such as a bus or network. Various example software implementations are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.

Computer system 3600 also includes a main memory 3608, preferably random access memory (RAM), and may include a secondary memory 3610. The secondary memory 3610 may include, for example, a hard disk drive 3612 and/or a removable storage drive 3614, representing a magnetic tape drive, an optical disk drive, CD/DVD drive, etc. The removable storage drive 3614 reads from and/or writes to a removable storage unit 3618 in a well-known manner. Removable storage unit 3618 represents a magnetic tape, optical disk, or other storage medium that is read by and written to by removable storage drive 3614. As will be appreciated, the removable storage unit 3618 can include a computer usable storage medium having stored therein computer software and/or data.

In alternative implementations, secondary memory 3610 may include other means for allowing computer programs or other instructions to be loaded into computer system 3600. Such means may include, for example, a removable storage unit 3622 and an interface 3620. An example of such means may include a removable memory chip (such as an EPROM, or PROM) and associated socket, or other removable storage units 3622 and interfaces 3620, which allow software and data to be transferred from the removable storage unit 3622 to computer system 3600.

Computer system 3600 may also include one or more communications interfaces, such as communications interface 3624. Communications interface 3624 allows software and data to be transferred between computer system 3600 and external devices. Examples of communications interface 3624 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc. Software and data transferred via communications interface 3624 are in the form of signals 3628, which may be electronic, electromagnetic, optical or other signals capable of being received by communications interface 3624. These signals 3628 are provided to communications interface 3624 via a communications path (i.e., channel) 3626. This channel 3626 carries signals 3628 and may be implemented using wire or cable, fiber optics, an RF link and other communications channels. In an embodiment of the invention, signals 3628 comprise data packets sent to processor 3602. Information representing processed packets can also be sent in the form of signals 3628 from processor 3602 through communications path 3626.

The terms “computer program medium” and “computer usable medium” are used to refer generally to media such as removable storage units 3618 and 3622, a hard disk installed in hard disk drive 3612, and signals 3628, which provide software to the computer system 3600.

Computer programs are stored in main memory 3608 and/or secondary memory 3610. Computer programs may also be received via communications interface 3624. Such computer programs, when executed, enable the computer system 3600 to implement the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 3602 to implement the present invention. Where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 3600 using removable storage drive 3614, hard drive 3612 or communications interface 3624.

CONCLUSION

It is to be appreciated that the Detailed Description section, and not the Summary and Abstract sections, is intended to be used to interpret the claims. The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present invention as contemplated by the inventor(s), and thus, are not intended to limit the present invention and the appended claims in any way.

The present invention has been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.

The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.

The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

1. A method for automatically integrating data across disparate information systems, comprising: developing a first set of mapping relations that map entities in a source structured data model to corresponding entities in a target structured data model; developing a second set of mapping relations that map entities in the source structured data model and entities in the target structured data model to one or more shared contexts; processing the first and second sets of mapping relations to generate a new structured data model; generating executable code that publishes the new structured data model to implement workflow between the source information system and the destination information system.
 2. The method of claim 1, wherein the generating step comprises: executing source information to retrieve an instance; translating the source instance into an instance that conforms to the new structured data model; and invoking a destination service to populate the instance conforming to the new structured data model.
 3. The method of claim 2, wherein the translating step comprises: invoking services corresponding to the shared context to mediate mismatches between entities in the source structured data model and entities in the target structured data model.
 4. The method of claim 3, wherein the invoking step comprises: mediating at least one of a structural mismatch, a syntactic mismatch, and a representational mismatch between entities in the source structured data model and entities in the target structured data model.
 5. The method of claim 1, further comprising: capturing the first set of mapping relations and the second set of mapping relations in one or more of (i) text format, (ii) HTML format, and (iii) spreadsheet format.
 6. The method of claim 1, further comprising: processing the first set of mapping relations and the second set of mapping relations to generate a schema.
 7. The method of claim 1, wherein the source structured data model, the target structured data model, and the new structured data model are ontologies.
 8. The method of claim 1, wherein the source information system and the target information system comprise any combination of a database and a web site.
 9. A computer program product comprising a computer useable medium having computer program logic recorded thereon for enabling a processor automatically integrating data across disparate information systems, comprising: means for enabling a processor to develop a first set of mapping relations that map entities in a source structured data model to corresponding entities in a target structured data model; means for enabling a processor to develop a second set of mapping relations that map entities in the source structured data model and entities in the target structured data model to one or more shared contexts; means for enabling a processor to process the first and second sets of mapping relations to generate new structured data model; means for enabling a processor to generate executable code that publishes the new structured data model to implement workflow between the source information system and the destination information system.
 10. The method of claim 9, wherein the means for enabling a processor to generate comprises: means for enabling a processor to execute source information to retrieve an instance; means for enabling a processor to translate the source instance into an instance that conforms to the new structured data model; and means for enabling a processor to invoke a destination service to populate the instance conforming to the new structured data model.
 11. The method of claim 10, wherein the means for enabling a processor to translate step comprises: means for enabling a processor to invoke services corresponding to the shared context to mediate mismatches between entities in the source structured data model and entities in the target structured data model.
 12. The method of claim 11, wherein the means for enabling a processor to invoke comprises: means for enabling a processor to mediate at least one of a structural mismatch, a syntactic mismatch, and a representational mismatch between entities in the source structured data model and entities in the target structured data model.
 13. The method of claim 9, further comprising: means for enabling a processor to capture the first set of mapping relations and the second set of mapping relations in one or more of (i) text format, (ii) HTML format, and (iii) spreadsheet format.
 14. The method of claim 9, further comprising: means for enabling a processor to process the first set of mapping relations and the second set of mapping relations to generate a schema.
 15. The method of claim 9, wherein the source structured data model, the target structured data model, and the new structured data model are ontologies.
 16. The method of claim 9, wherein the source information system and the target information system comprise any combination of a database and a web site. 